Skip to main content

Sift Audience Data Query Language (DQL) – 1


I had been thinking long and hard on enhancing the Audience Management capability in Sift. Everybody seems to have it - from the mighty big players to the obscure ones. In such a crowded space, it is of prime importance to have a very clear and a striking "unique selling proposition" before you boast that you have this capability too. Basically, the tool shall solve a problem that was very challenging previously, in a cost effective way that was not thought of commonly. Now, Sift is a real time analytics engine. So obviously, the profiles it builds about individual entities are accurate - up to the second. That gets a lot of brownies to start with. And as Sift would limitlessly process data, the data points it can compute for a single entity would also be much wider in range. This is great too. But,

1. What if I add "relationships between the entities" into this mix?

For example, retrieving "all customers who just visited a particular online storefront" is a usual real time audience query. But to retrieve "all customers who just visited a particular online storefront and also all the high spending members in their social group", is what I am talking about. Wait a minute, this sounds like the Facebook flavour of challenge? Yes. But as we saw before, Sift has the advantage of dealing with very deep first party data. Therefore the profiles in Sift would be much deeper, wider, "real time" (and could be much richer than Facebook itself). Therefore, adding relationships to this mix makes it insanely powerful.

2. Now, what if I also add an exciting DQL (Data Query Language) to manipulate the audience data?

Packed inside an intuitive API, the DQL would bring extreme flexibility and richness by which any client can combine, mix and match and manipulate the audience data. Such flexible APIs would make Sift a powerful DMP platform and a critical ingredient to embark on the data monetization journey. I think these additional 2 aspects are super convincing unique selling propositions by themselves. To add these capabilities, I ended up building a very special purpose database, which is also web-scale and in-memory. More on that exciting component later. The focus of this blog is to talk about the DQL and the APIs for playing with Sift's Audience Management feature. To start with, if you want to query the list of profiles(we call them 'Active Behaviour Clusters' or in short 'ABCs') that Sift tracks about any entity, you may try this query :
curl -d '{ "query" : "?:ABC","resultType" : "reducedValues", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
Sift would return a result as below, depending on what is defined in this cloud instance of Sift at the time of query. (Yes, that ip address hits a sandbox with a purposely relaxed security)
{"status":"success","query":"?:ABC","processingTime(ms)":2,"total":40,"result":["HiRF_MiSL:ABC","MiRF_HiSL:ABC","MASS:ABC","RECOVERED_SILENT:ABC","MIGRANT_NEWBIE:ABC","SILENT_HIGH_ARPU:ABC","MIGRANT:ABC","ATL_GROUP:ABC","TOURIST:ABC","NEWBIE:ABC","BTL_GROUP:ABC","HiRF_HiSL:ABC","IN_SilomDistrict:ABC","ACTIVE:ABC","BTL_SPECIAL_GROUP:ABC","BURN_CHKBAL_CLUSTER:ABC","EXISTING:ABC","IN_AllSeasons:ABC","IN_SiamSquareShopping:ABC","BURN_SOS_CLUSTER:ABC","MiRF_MiSL:ABC","SUSPENDED:ABC","SILENT_SEED:ABC","MIGRANT_EXISTING:ABC","HiRF_LoSL:ABC","ROCKET_REFILL_NON_RESPONDE:ABC","BURN_MOBILEON_CLUSTER:ABC","REFILL_SEED:ABC","MiRF_LoSL:ABC","MASS_EXISTING:ABC","SILENT_LOW_ARPU:ABC","BURN_REFILL_CLUSTER:ABC","LoRF_LoSL:ABC","CLT_MNP_IMMEDIATE_FILE:ABC","CLT_MNP_HOLD_FILE:ABC","LoRF_MiSL:ABC","BURN_ENDCALL_CLUSTER:ABC","LOW_BALANCE:ABC","LoRF_HiSL:ABC","MASS_NEWBIE:ABC"]}
Let us dissect this result first. You would notice that the result is a Json output and it has an array to hold the actual results. Each entry in the result array ends with "ABC", just to say that this is an Active Behaviour Cluster. In the query you asked for ?:ABC. This is what I call a Schema Query, which would list all the ABCs. In the query, you also mentioned that that the "resultType" is "reducedValues", which means to list all the values combined in one array. You may also try this query by replacing the "reducedValues" to "count". Now, having seen the list of ABCs in Sift, you may also query how many customers (or entities) have been classified to be in a particular ABC.
curl -d '{ "query" : "ACTIVE:ABC","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
I got this result but you may get a different one depending on the state at which you try
{"status":"success","query":"ACTIVE:ABC","processingTime(ms)":1,"result":100}
This is to say that there are 100 customers that are ACTIVE at this very second (you can say that these are people who have active contracts with a telco, a bank or a retail and are doing regular transactions). Now, what if I want to ask for ACTIVE customers who are also falling into LOW_BALANCE profile ?
curl -d '{ "query" : "(AND LOW_BALANCE:ABC ACTIVE:ABC)","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
What if I want to ask how many such low balance customers are there in "Siam Square Shopping Center" geofence, right at this second ?
curl -d '{ "query" : "(AND IN_SiamSquareShopping:ABC (AND LOW_BALANCE:ABC ACTIVE:ABC))","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
{"status":"success","query":"( AND IN_SiamSquareShopping:ABC (AND LOW_BALANCE:ABC ACTIVE:ABC))","processingTime(ms)":5,"result":29}
I got the above results. 29 of my 100 active customers are currently having a low balance and in the geofence I am tracking. If you want to find out who they are, you may change the resultType to "reducedValues" and you would get their anonymous identifiers. Isn't that insanely powerful? In reality, the results would be tens of millions and the API support pagination to fetch all the results. And you can quickly launch an engagement (campaigns) in Sift for them in a matter of few minutes by using Sift Portal. Kuldeep shares a very exciting blog on how you can use Sift's beautiful analytical visualisers to manipulate these audiences and launch engagements. For those DQL lovers, here is the BNF grammar version of the Sift's Audience Management DQL
GraphQuery -> SetOperationQuery GraphQuery -> ApplyOperationQuery GraphQuery -> FilterOperationQuery GraphQuery -> SchemaQuery GraphQuery -> EdgeQuery SetOperationQuery -> ( SET_OPERATION GraphQuery GraphQuery ) ApplyOperationQuery -> ( APPLY_OPERATION EDGE GraphQuery ) FilterOperationQuery -> ( FILTER_OPERATION pattern GraphQuery ) SchemaQuery -> \w+?\w+ EdgeQuery -> \w+:\w+ EDGE  -> \w+: SET_OPERATION -> AND|OR|DIFFERENCE APPLY_OPERATION -> APPLY FILTER_OPERATION -> FILTER pattern -> \w+=\w+
In the next part, how we an use Sift's DQL to query for audience with relationships. It excites me every time I talk about it.

Comments

Post a Comment

Popular posts from this blog

The Stunts for Attention

Some little distraction here before we get on with rebooting the universe. I heard some comments about the writing style of that post (Reboot the Universe - part 1) from various channels. The notable one came from Dhaks, universally recognized as Maams. He said, “hmm...u were alright when I met you last... ”. The genius-unexplored Henry was seconding Maams that “its the usual (weird?) Raja” . Hahhaha. Well, my sanity seems to have left with you Maams. Hereafter, please don’t leave me and go! (Nah, this is not a marriage proposal by any means :P). But honestly, I had my serious doubts about how that style would be received. Actually, I don’t have any fixed style of writing. Nor themes or subject matters. I deliberately keep it that way. In acting, there is such a thing called ‘method acting’. I gathered that its where the characters prevailing above and over the identity or the mannerisms of the actor himself. Daniel Day Lewis! Check him out in IMDB and compare (his real looks with) ...

Reboot the Universe - Final

It started here Little Trisha has 2 things that I like in her. One - she loves reading. Just that her books have rats as the heroes. Duh!! I have been asking her to grow up. Second - she is random. Radiantly random. This was the recent random thing we did together. That day, she influenced me to a soya milk from the Jolly-Bean’s. Now, soya milk can be boring as sewage. So to give it a bang, they add these ‘chewy pearls’ to it. For the uninitiated, chewy pearls are tiny sweet balls, made using tapioca and dropped lovingly into your drink. If you ever crave to nibble and bite into a soft human body part, you shall try the chewy pearls. So we walked out, bought our cups and started drinking. Baby : Jokes time, Daddy! Daddy : Okay…. Where would a bored cow go? Baby : Where? Daddy : Moooooovie, of course! Ouch!! She giggled at that and got chocked with a chewy pearl. And in a short struggle, she managed to shoot it out of her throat onto the floor. First we thought of cleaning it up. But t...

(Some(what)) Clear Thinking on a Cloudy Thing.

Me and Ajay brainstormed on the dynamics of an ultimate cloud computing environment. And I extracted our ideas into this (cartoon) context diagram. We are dealing with daunting levels of complexity in this area today. So ‘Abstraction’ becomes the key with which we approached this subject. An ever expanding box with a dashboard and a toilet man are the subtle(?) visual cues that convey the abstraction and ease of use. The levels of technological maturity as we expect here are not available today. This is the ‘FUTURE’. A Nirvana in Cloud Computing. And it speaks thus: The Cloud is the new operating system. Elastically growing and shrinking hardware are achieved already. The Cloud Management platform will maintain an inventory of the hardware capability - updating it as the hardware had been consumed and released. We will not deal at the application servers, BPM engines, databases level anymore. The tools will not matter. ‘DIY Blocks’ here refers to a catalog of pre-built ensembles that ...