Adobe Analytics API 2.0: Explanation

In this article, I will try to clarify the use-cases for the Adobe Analytics API 2.0 and how to use it. There has been quite some misunderstanding about how to use this API and, to be fairly honest, quite some disappointment about its capabilities, at least on my side.

Adobe Analytics API 2.0 Context

As a brief introduction, I would like to give some contextual information that is good to know when considering the API 2.0. This context is mostly my own private thoughts, things that I put into perspective on my own. What I am trying to say is that these are not statements coming from Adobe Analytics product management. However, when you look at the big picture, I may not be that far from the truth.

Closer to a RESTful API

The latest version of the Adobe Analytics API in use was the 1.4 API, which uses an OAuth token to connect to Adobe Analytics capabilities. Usually, Adobe releases a new version of the API every 3 years. For this version, however, the change came after 4 years. As you could have guessed, this new version was widely expected.

With every new version, new capabilities are expected and improvements in usability are researched. What I can imagine is that Product Management realized that the Adobe Analytics 1.4 API was quite complete in terms of features; however, it missed a real-time capability. What I call real time is receiving the information as you request it, not real-time streaming data.
The 1.4 API uses a queuing system for the creation of reports and, except for the Live Stream API, there is no way to collect data in a real-time fashion as you do for reports. The Live Stream API is quite limited: of all the variables, it can only retrieve props and events, and it is a stream of incoming data where no processing happens. In other words, only variables that don't need the 30-minute processing time required for visits.

So the first idea was probably to provide an API that is more responsive: a full REST API, with no queuing and no (big) waiting time.

Workspace update

The 2nd element to know is that Adobe released the Workspace interface at some point, and the 1.4 API really looked like a "Reports & Analytics API", even used by the Report Builder feature. It didn't have the full flexibility of what a Workspace report was able to provide.
In order to provide these capabilities to users, Adobe needed to change their report statement.

The 1.4 API was limited to a certain number of breakdowns, and not all of the dimensions were supported. Also, segmentation has been developed quite a bit, so you can now use a segment as a dimension.
To make it short, an improved query statement was required to keep up with what the UI was able to provide.

The root of the API disappointment

The previous elements explain why the new API was mandatory, and you could have guessed what kind of API was coming. As mentioned before, I was personally quite disappointed when the API was released. To my surprise, I was one of the only people I know feeling that the new API was actually a downgrade.
It may be because clients and new users are actually getting what they were expecting from the new API. In that case, it is fine: my use-cases, the way I want to work with the API, may not be the ones of the masses. However, I also have the feeling that people may not see why I consider it a downgrade.
The global architecture of the API does not match my expectations (but maybe it matches yours), and I will explain why in the following paragraphs. Some reasons are contextual to Adobe technology decisions and some come from development decisions.

Adobe Experience Cloud

Let's do the obvious first: most of the capabilities I was looking for (a query service, complete raw data capabilities, parity with the existing 1.4 API capabilities) were not possible for the simple reason that the global architecture of Adobe was moving.
During the year before the release, the first moves toward the Experience Cloud architecture were being made. User management is now integrated into the Admin Console, and API access is managed through Adobe IO.
All of that stripped the Analytics API of most of its functionalities.

Experience Platform

When the development of the API 2.0 started, we can easily imagine that the Experience Platform was just a project, or a wish-list on Adobe's side. No real functional specifications had been decided yet.
However, during the 3 years of development of the Analytics API 2.0, it became clear that concerns were raised about where some capabilities would be built.
Segment management or extraction of raw data could have been managed by both services but, in the end, as Adobe works its way toward fully integrating these services within Platform, the more interesting functionalities or capabilities were kept for the Experience Platform API.

The previous paragraphs explained why the API was drastically reduced in terms of capabilities. A simple comparison list clearly points out the differences:

Capability      | Adobe API 1.4 | Adobe API 2.0
Segments        | Yes           | Yes
Reports         | Yes           | Yes
Data Warehouse  | Yes           | No
Live Stream API | Yes           | No
Classification  | Yes           | No
Data Source API | Yes           | No

The table above is just an extract of the API capabilities, but you can already notice a considerable scope reduction for the new API.

RESTful API vs GraphQL

This element is not that easy to consider because, at the time of development of the new Analytics API, GraphQL was barely a thing. However, when the API was being finalized, it was clear enough that it would be one of the last of its kind, especially for Analytics.

Instead of a RESTful architecture, a GraphQL architecture could have made a lot more sense, as users are only requesting analytics data (even if we know it is not exactly that in the background).

Were the developers aware of that when they developed it? Well, I am not sure. What I am sure of is that the REST methods used are… quite strange, especially compared to the rest of the Adobe IO stack. Most of the Adobe IO APIs use JSON data in the POST request to generate the response, whereas the Analytics API 2.0 is mostly (but not only) using query string parameters.

It is not a question of aesthetics; it also implies some limitations in the way you request the data. A JSON format is much more flexible for handling nested information or further development than query string parameters.
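
To make this concrete, here is a minimal sketch in Python contrasting the two styles. The credentials and parameter values are placeholder assumptions for illustration, not exact Adobe IO contracts:

```python
import requests

# Placeholder credentials -- assumptions for illustration only.
HEADERS = {
    "Authorization": "Bearer <access_token>",
    "x-api-key": "<client_id>",
}
BASE = "https://analytics.adobe.io/api/<company_id>"

# Analytics API 2.0 style: flat key/value query string parameters,
# with no way to express nested structures in the query string.
segments = requests.get(
    f"{BASE}/segments",
    headers=HEADERS,
    params={"limit": 10, "page": 0},
)

# The JSON-body style used by most other Adobe IO APIs (and by the 2.0
# /reports endpoint, the "but not only" part): nesting is trivial here.
report = requests.post(
    f"{BASE}/reports",
    headers=HEADERS,
    json={
        "rsid": "<report_suite_id>",
        "metricContainer": {"metrics": [{"id": "metrics/visits"}]},
    },
)
```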

When comparing the Launch API and the Analytics API, it feels like the Launch API is the best you can ask of a RESTful API, whereas the Analytics API is just an attempt to use a new request method on top of an old architecture.

Analytics has lots of legacy, and I am quite sure that the talent of the developers working on the Analytics API matches that of the Launch team developers. It is not their fault that this API looks old from the start; it is probably the architecture complexity that gives this feeling.

Capability decisions

The last point of disappointment is very personal. It is really a matter of opinion now, and I am just judging it from my Digital Analyst "slash" Machine Learning enthusiast "slash" Data Science point of view.

It has been decided that the API will provide users with Workspace capabilities and fast answers to requests. Dropping the queue is the most notable decision in that direction.
Where most people sell the API as the programmatic way to call Workspace and find it very good, I am much more disappointed by that decision.

It would have been a good decision if you had the UI-kind-of-request with the power of programmatic servers to answer it. However, that is not the case, and what is a natural human limit on requesting multiple elements becomes a bottleneck for the API.
Adobe didn't bring the benefit of Workspace to applications; it limited applications to the Workspace capacities.

Let me explain:

The API does not take care of the (Low Traffic) value, so there is no advantage in using the API for your report on that point. Fair enough, it may have been too much work on the back end for that capability.

The API does not handle large amounts of requested data well. Due to the removal of the queuing process, it is advised to split your large request into several small ones, and to cache the results for later use.
Adobe has stated that the previous API version (1.4) was handling too much for the end-user, hiding the cost of the request from them. This resulted in a large number of requests that drained a lot of computational power on Adobe's side and degraded the reporting experience for everyone.
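
As a minimal sketch of that split-and-cache advice (the page size, URL and helper name are my own assumptions, not an official Adobe recommendation):

```python
import json
from pathlib import Path

import requests

CACHE_DIR = Path("cache")
CACHE_DIR.mkdir(exist_ok=True)

def fetch_page(session: requests.Session, url: str, page: int, limit: int = 50) -> dict:
    """Fetch one small page and cache it on disk so a re-run is free."""
    cache_file = CACHE_DIR / f"page_{page}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    resp = session.get(url, params={"page": page, "limit": limit})
    resp.raise_for_status()
    data = resp.json()
    cache_file.write_text(json.dumps(data))
    return data

# Usage sketch: several small requests instead of one big one.
# session = requests.Session()
# pages = [fetch_page(session, "<listing_endpoint>", p) for p in range(5)]
```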

So Adobe decided to open the low-level API of Workspace to the users, so that users have to deal with the computational cost of requesting information. This provides transparency over the time it takes to generate some reports: before, you were waiting in the queue with no idea how long it would take; now you can better estimate it.
That is because of 2 reasons:

  • The Analytics API 2.0 is limited to 120 calls per user per minute.
    The API 1.4 was limited to 20K calls per company per hour.
  • The Analytics API 2.0 requires you to make multiple calls to generate a breakdown report (1 per breakdown).

So depending on the number of breakdowns you want to achieve, you can calculate how long it will take to generate your report. There is a very good answer from Brian Kent Watson on the Adobe Forum here.
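
As a back-of-the-envelope sketch of that calculation (assuming the 120 calls per minute limit is the only constraint and that each breakdown element costs one call):

```python
# Rough estimate of report generation time under the 2.0 rate limit.
CALLS_PER_MINUTE = 120  # Analytics API 2.0 limit per user

def report_minutes(top_elements: int, breakdown_levels: int) -> float:
    """One call for the top-level report, then one call per element per breakdown level."""
    total_calls = 1 + top_elements * breakdown_levels
    return total_calls / CALLS_PER_MINUTE

# Breaking down the top 1000 pages by one extra dimension:
print(f"{report_minutes(1000, 1):.1f} minutes")  # ~8.3 minutes at best
```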

On that part, I do not agree with Adobe: you now have to feel the pain of the interface loading even when you are not using the interface. This limits, in a huge way, the use-cases that can be delivered by the Adobe Analytics API 2.0.

On top of the inherent limitations listed above, I found 2 other things that make me not a big fan of this API:

  • Documentation issues: I developed for the 1.4 Analytics API and I know that its documentation was not good (now even worse with its move to GitHub).
    But with the 2.0 API, Adobe may need to consider some drastic changes to the documentation.
    The swagger UI is nice… but it provides neither full explanations nor always correct information.
    When you see the endpoint "/segments" for example, what they mean is: "https://analytics.adobe.io/api/{company_id}/segments" 😀 (see the sketch after this list)
  • Misleading communication: When you start with the API, it is stated that you should "Make multiple, smaller requests instead of a large, single request." (documentation)
    However, due to the threshold in place (120 calls per minute), there is only so much you can do with multiple breakdowns.
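
For clarity, here is a minimal sketch of what the fully-resolved "/segments" call looks like. The token and client id are placeholders, and the discovery response shape shown is my assumption of the usual payload:

```python
import requests

headers = {
    "Authorization": "Bearer <access_token>",  # placeholder token
    "x-api-key": "<client_id>",                # placeholder client id
}

# The swagger UI shows only "/segments"; the real URL needs your company id.
# The discovery endpoint can return it; the payload keys below are my
# assumption and may differ for your organization.
me = requests.get("https://analytics.adobe.io/discovery/me", headers=headers).json()
company_id = me["imsOrgs"][0]["companies"][0]["globalCompanyId"]

resp = requests.get(
    f"https://analytics.adobe.io/api/{company_id}/segments",
    headers=headers,
)
print(resp.status_code)
```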

Use Cases for Analytics API 2.0

That being said, I am not completely against the 2.0 API. I still think that it can be useful for companies, but they need to understand the use-cases that can be applied to this API. And this was a big miss on our side.
Understanding the context and the architectural decisions behind this API should make you aware of the different use-cases that you can realize with it.
Nevertheless, I will list the use-cases that I feel are convenient for this API.

Use Case                                                       | API 2.0 Compatibility
Fast reporting for top 10 / 100 / 1000 elements                | Yes
Building dashboards with top selections                        | Yes
Listing segments                                               | Yes
Listing users                                                  | Yes
Detecting duplicate segments                                   | Yes
Detecting deprecated users (last login more than 6 months ago) | Yes
Monthly interactive dashboard generation                       | Yes
Large amounts of dimensions requested                          | No (1.4 or DW)
Large time frame requests                                      | No (1.4 or DW)
Data source for Machine Learning                               | No (1.4, DW, or Data Feed)

Those use-cases are the ones that I feel comfortable (or not) doing with the new API. If you have any other use-cases that you feel can be covered, feel free to comment on this post.

I hope this article helped you better understand the Analytics API 2.0 and how to properly use it, so that you don't get frustrated and hit the wall of its limitations.

4 Comments

  1. Very interesting post!

    A use case that I'm exploring is building a multi-level breakdown report down to the unique user level. Would there be limitations other than the multiple API calls for the breakdown report? Some use cases could potentially lead to a million rows. Would it be better to use DW?

    I'm trying to avoid DW due to its extremely long processing time (about 24 hrs), and I am hoping that using the 2.0 API will reduce that time by manually doing the breakdown of the data.

    1. Hello John,
      Because the API has a limitation of 120 requests per minute and, as you have explained, you would need one request (one breakdown) per user, that could lead to a big latency. It would really depend on how many users you are planning to have.
      A very simple calculation for 1 million users (1,000,000): if you manage to make 120 requests per minute, it would take 8,333 minutes, which is about 139 hours.
      I would guess DW is the better way to go; on top of that, Adobe probably has a better system internally to handle that sort of big query.
      Another problem is that you will have to deal with the (Low Traffic) value after 500K unique items in your unique visitor dimension.
      btw: Through the 1.4 API, you can request DW directly.
      btw 2: A Query Service will exist with Adobe Experience Platform; I'll try to cover that in the future.

  2. Great post!

    I'm looking for an option to download the Usage and Access logs via the API. Is that possible with API 2.0?

    Thanks,
    Parthipan

    1. For user management, there is a different API available for that.
      It is called the Admin Console API (I did a wrapper for that one as well).
      You can also retrieve Analytics users from the Analytics API 2.0, but you do not get the same level of detail (nor with the Admin Console for now).
