BigBear.ai
  • Home
  • Industries
    • Academia
    • Government
    • Healthcare
    • Manufacturing
  • Solutions
    • Cyber
    • Data Analytics
    • Enterprise Planning and Logistics
    • Intelligent Automation
    • Modeling Solutions
    • Professional Services
  • Company
    • About
    • Investor Relations
    • Partners
    • Team
  • Careers
    • Benefits
    • Culture
    • Explore Jobs
    • Military and Veterans
    • Applicant Login
    • Employee Login
  • Resources
    • Blog
    • Events
    • Newsroom
    • Resource Library
    • Online Store
  • Contact
Search

Home Qlik Tips for Extracting Data Using the Qlik REST Connection

Blog

Tips for Extracting Data Using the Qlik REST Connection

Paul Pirolli
December 1, 2020
  • Share
  • Share

The Qlik REST connector is one of the more powerful tools when it comes to connecting to any Web resource. Pretty much any web database supports REST connections. As a Qlik developer, what are the tools at our disposal to develop these types of connections? Here are the steps I take to develop a Qlik REST connection.

Documentation

Every website that supports REST connections has documentation on how to connect. It will outline authentication (which sometimes requires secret generation), headers, whether a cookie/token must be passed, etc. Before attempting to connect to a website, read up on the documentation & find what endpoints need to be hit and what must be conserved in order to establish trust.

Tools

  • Web search
  • Site Developer’s documentation

Authentication

After reading up on the documentation it is time to make the connection. Depending on the security protocols in place, connecting will be more or less difficult. Sometimes a connection can be made using the Qlik REST connector & we can jump right into extracting without further hassle. However, even after reading the documentation, the site can offer challenges to connecting. Here are some things I have seen:

  • The site requires dynamic parameters
  • The site requires dynamic headers
  • The site requires passing a session-specific cookie or Token
  • A hashing algorithm is required

Except for the last point, all of these requirements are easily handled within Qlik.

My process for establishing a connection is to first try to execute it within Qlik. When that does not work, I move over to Postman so I can test quicker & save my requests (Postman does this automatically). Postman is also easier to inspect my request & the website’s response without having to go into the viewer as I would in Qlik. It also stores cookies in a place I can easily inspect them and even copy them into Qlik if desired.

Once authenticated, assign to variables any part of the reply that is needed for future requests using the Peek( ) function.

Tools

  • Postman
  • Qlik REST connector
  • Qlik Variables

Extracting Data

Time to get to the data! Using the Qlik REST connector, inspect the query response, and insert the script. This will work as the script outline.

Challenges to data extraction using Qlik REST connections:

  • Data required is in several different REST endpoints
  • Some items require unique REST requests passing item ID to get all associated data
  • Requests require pagination
  • Requests require throttling due to server REST API request limits
  • There is a required field that does not appear in the Qlik-generated script

We will break down each of these in the section below.

REST Connection Changes

When making a REST connection, I first thought that I would have to make a different Qlik connector for each endpoint I needed to hit. It was to my great relief that Qlik provides a command to change the connection on the fly with the following command:

Tool

  • “WITH CONNECTION”

Using this command, everything from Headers & Body to the URL of the request can be changed. When I use this tool in development, I will often configure the Qlik REST connection directly for the new endpoint I need to hit to generate the load script. Once I get the system-generated load script, I specify my connection endpoint using WITH CONNECTION and move on to the next table. This solution responds to the first 2 bullet points above.

Example

Let’s apply this to a real-world scenario. I have a website that I am using a REST connection with. I authenticate my connection using the default connection string in my connector (pointing to the authentication URL) and it returns my token. I then use Peek( ) to assign the token to my variable

Let vSiteToken = Peek('siteToken',0,'myAuthTable');

Drop Table myAuthTable;

Now I need to hit the endpoint for my data table. Edit the connection we are using and change the URL to the desired endpoint. I will often use a trace to print out the token on the screen so I can copy & drop it in as a header at this point. Then test the connection, close, and select data. Insert the script below the authentication request. Now we can adjust the very end of the request as follows:

FROM JSON (wrap on) "root" PK "__KEY_root"

WITH CONNECTION (
Url "https://myWebsite.com/rest/NewURL"

HTTPHEADER "tokenHeaderName" "$(vSiteToken)"

);

Now we can use the same connection to hit multiple endpoints. This can be used using a “For Loop” to loop through a list of IDs and concatenating the results together on the same table.

Pagination

While Pagination is supported automatically through the Qlik REST connector setting, it is sometimes necessary to control this manually. This is especially necessary if the requests need to be throttled.

What is throttling? Some REST API connections have a limit of requests that can be made per minute. This is to ensure that one user cannot take more than a reasonable amount of server resources to the detriment of other users.

Tools

  • Response headers
  • Sleep command
  • For loop
  • Response sorting

Example

Let’s say we have a site that returns the data of the REST request in pagination. So we configure the pagination on the connector and execute. After about 15 seconds an error shows up that the server refused the request. We try re-connecting but find that the server is refusing all requests now. Thankfully we have read the documentation though, and recall that there was something about making too many requests and a time-out period if that request limit is breached. So we consult the documentation again and find that the limit is 150 requests/minute. We need to throttle.

While we are waiting for our time-out period to expire, we start setting up variables.

Set vThrottleThreshold = 150; //amount of requests per minute

Let vThrottleTime = 60000 / vThrottleThreshold; //1000*60 milliseconds per minute.

Sleep works in milliseconds. I sometimes subtract about ~30 milliseconds as well since the request takes time. This can be set after some trial & error.

We will also need 2 more variables, 1 for page number, and another for the number of pages. If a request has pagination, often the first request will return with headers outlining what page number it is and how many pages there are. These headers need to be set to those variables for the parameters of the “For Loop”.

Now let’s go over how this will work: We will set our throttle limit, make our first request outside the for loop, set the variables with page info, then jump into our for loop. Once inside, the first action performed is the sleep for our required time, then we will make our request and set the parameter for page 2, and so on. It is important to note here that sending a parameter for sorting may be necessary so that each request returns new data.

Here is what the code may look like:

Set vThrottleThreshold = 150; //amount of requests per minute

Let vThrottleTime = (60000 / vThrottleThreshold) - 30;

myRESTTable:

SQL SELECT ...

FROM JSON (wrap on) ...

WITH CONNECTION(URL "https://mySite.com/rest/item?orderBy=Name");

Let vPageNum = Peek(...);

Let vPageTotal = Peek(...);

For vPageNum to vPageTotal //may need to put variables in $()

Sleep $(vThrottleTime);

Concatentate (myRESTTable)

SQL SELECT ...

FROM JSON (wrap on) ...

WITH CONNECTION(URL "https://mySite.com/rest/item?orderBy=Name&PageNo=$(vPageNum)");

Next vPageNum

//autogenerated script follows here

You will also need to add the page number to keys in the extraction. This will keep keys unique and avoid join errors down the road.

Field Not Found

I ran into this when working with a REST connector for JIRA. Sometimes the fields returned are not intuitively named or the JSON objects never have that field defined in the sample set that is first returned and generates the script. This will involve troubleshooting.

Some Tools

  • Targeted REST extraction
  • Browser object inspector found in browser Developer tools
  • Site REST documentation

These two tools approach the problem differently. The first thing to do is get an example of the sought-for property from the end-user. Consult the REST documentation and see if there an endpoint to extract that item all by itself. Often performing that and inspecting the table in Qlik Data Model Viewer will give us the answer we need.

However, there are times when the response has hundreds of fields and it is like looking for a needle in a haystack. Here I have often changed my approach to looking at the ticket in the web-browser and inspecting the desired field. This sometimes has the field name or some identifier that we can then find in the data model.

Once the field is located, adjust the automatically generated Qlik script to extract the field.

I would like to hear of a case where either of these do not work. I am sure there are examples out there, but I have not seen the example yet.

I hope that you have found these tips useful. REST connections are sometimes challenging to work with, but that is what makes them so much fun. There is nothing more satisfying than getting through a complex REST authentication process and arriving at the data. Good luck in your endeavors, and may you be met with success every step of the way. Happy Qlik-ing.

Posted in Business Intelligence, Qlik.
BigBear.ai
  • Home
  • Industries
  • Solutions
  • Company
  • Careers
  • Blog
  • Investor Relations
  • Contact
  • Twitter
  • Facebook
  • Linkedin
  • Google My business for BigBear.ai
1-410-312-0885
[email protected]
  • Privacy Policy
  • Terms of Use
  • Accessibility
  • Site Map
© BigBear.ai 2023
We value your privacy
We use cookies on our website to give you the most relevant experience by remembering your preferences and repeat visits. By clicking “Accept”, you consent to the use of ALL the cookies.
Privacy Policy | Do not sell my personal information
AcceptCookie Settings
Manage Consent

Cookies Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may affect your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. These cookies ensure basic functionalities and security features of the website, anonymously.
CookieDurationDescription
cookielawinfo-checkbox-analytics11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional11 monthsThe cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
JSESSIONIDsessionThe JSESSIONID cookie is used by New Relic to store a session identifier so that New Relic can monitor session counts for an application.
viewed_cookie_policy11 monthsThe cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
Functional
Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.
CookieDurationDescription
__atuvc1 year 1 monthAddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.
__atuvs30 minutesAddThis sets this cookie to ensure that the updated count is seen when one shares a page and returns to it, before the share count cache is updated.
Analytics
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.
CookieDurationDescription
_ga2 yearsThe _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_ga_NK4L4Q320Q2 yearsThis cookie is installed by Google Analytics.
_gat_gtag_UA_163894009_21 minuteSet by Google to distinguish users.
_gid1 dayInstalled by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
at-randneverAddThis sets this cookie to track page visits, sources of traffic and share counts.
CONSENT2 yearsYouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
uvc1 year 1 monthSet by addthis.com to determine the usage of addthis.com service.
Advertisement
Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.
CookieDurationDescription
f5avraaaaaaaaaaaaaaaa_session_sessionbusinesswire.com cookie
loc1 year 1 monthAddThis sets this geolocation cookie to help understand the location of users who share the information.
VISITOR_INFO1_LIVE5 months 27 daysA cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSCsessionYSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devicesneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-idneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
Save & Accept