Skip to main content

Sift Audience Data Query Language (DQL) – 1


I had been thinking long and hard on enhancing the Audience Management capability in Sift. Everybody seems to have it - from the mighty big players to the obscure ones. In such a crowded space, it is of prime importance to have a very clear and a striking "unique selling proposition" before you boast that you have this capability too. Basically, the tool shall solve a problem that was very challenging previously, in a cost effective way that was not thought of commonly. Now, Sift is a real time analytics engine. So obviously, the profiles it builds about individual entities are accurate - up to the second. That gets a lot of brownies to start with. And as Sift would limitlessly process data, the data points it can compute for a single entity would also be much wider in range. This is great too. But,

1. What if I add "relationships between the entities" into this mix?

For example, retrieving "all customers who just visited a particular online storefront" is a usual real time audience query. But to retrieve "all customers who just visited a particular online storefront and also all the high spending members in their social group", is what I am talking about. Wait a minute, this sounds like the Facebook flavour of challenge? Yes. But as we saw before, Sift has the advantage of dealing with very deep first party data. Therefore the profiles in Sift would be much deeper, wider, "real time" (and could be much richer than Facebook itself). Therefore, adding relationships to this mix makes it insanely powerful.

2. Now, what if I also add an exciting DQL (Data Query Language) to manipulate the audience data?

Packed inside an intuitive API, the DQL would bring extreme flexibility and richness by which any client can combine, mix and match and manipulate the audience data. Such flexible APIs would make Sift a powerful DMP platform and a critical ingredient to embark on the data monetization journey. I think these additional 2 aspects are super convincing unique selling propositions by themselves. To add these capabilities, I ended up building a very special purpose database, which is also web-scale and in-memory. More on that exciting component later. The focus of this blog is to talk about the DQL and the APIs for playing with Sift's Audience Management feature. To start with, if you want to query the list of profiles(we call them 'Active Behaviour Clusters' or in short 'ABCs') that Sift tracks about any entity, you may try this query :
curl -d '{ "query" : "?:ABC","resultType" : "reducedValues", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
Sift would return a result as below, depending on what is defined in this cloud instance of Sift at the time of query. (Yes, that ip address hits a sandbox with a purposely relaxed security)
{"status":"success","query":"?:ABC","processingTime(ms)":2,"total":40,"result":["HiRF_MiSL:ABC","MiRF_HiSL:ABC","MASS:ABC","RECOVERED_SILENT:ABC","MIGRANT_NEWBIE:ABC","SILENT_HIGH_ARPU:ABC","MIGRANT:ABC","ATL_GROUP:ABC","TOURIST:ABC","NEWBIE:ABC","BTL_GROUP:ABC","HiRF_HiSL:ABC","IN_SilomDistrict:ABC","ACTIVE:ABC","BTL_SPECIAL_GROUP:ABC","BURN_CHKBAL_CLUSTER:ABC","EXISTING:ABC","IN_AllSeasons:ABC","IN_SiamSquareShopping:ABC","BURN_SOS_CLUSTER:ABC","MiRF_MiSL:ABC","SUSPENDED:ABC","SILENT_SEED:ABC","MIGRANT_EXISTING:ABC","HiRF_LoSL:ABC","ROCKET_REFILL_NON_RESPONDE:ABC","BURN_MOBILEON_CLUSTER:ABC","REFILL_SEED:ABC","MiRF_LoSL:ABC","MASS_EXISTING:ABC","SILENT_LOW_ARPU:ABC","BURN_REFILL_CLUSTER:ABC","LoRF_LoSL:ABC","CLT_MNP_IMMEDIATE_FILE:ABC","CLT_MNP_HOLD_FILE:ABC","LoRF_MiSL:ABC","BURN_ENDCALL_CLUSTER:ABC","LOW_BALANCE:ABC","LoRF_HiSL:ABC","MASS_NEWBIE:ABC"]}
Let us dissect this result first. You would notice that the result is a Json output and it has an array to hold the actual results. Each entry in the result array ends with "ABC", just to say that this is an Active Behaviour Cluster. In the query you asked for ?:ABC. This is what I call a Schema Query, which would list all the ABCs. In the query, you also mentioned that that the "resultType" is "reducedValues", which means to list all the values combined in one array. You may also try this query by replacing the "reducedValues" to "count". Now, having seen the list of ABCs in Sift, you may also query how many customers (or entities) have been classified to be in a particular ABC.
curl -d '{ "query" : "ACTIVE:ABC","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
I got this result but you may get a different one depending on the state at which you try
{"status":"success","query":"ACTIVE:ABC","processingTime(ms)":1,"result":100}
This is to say that there are 100 customers that are ACTIVE at this very second (you can say that these are people who have active contracts with a telco, a bank or a retail and are doing regular transactions). Now, what if I want to ask for ACTIVE customers who are also falling into LOW_BALANCE profile ?
curl -d '{ "query" : "(AND LOW_BALANCE:ABC ACTIVE:ABC)","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
What if I want to ask how many such low balance customers are there in "Siam Square Shopping Center" geofence, right at this second ?
curl -d '{ "query" : "(AND IN_SiamSquareShopping:ABC (AND LOW_BALANCE:ABC ACTIVE:ABC))","resultType" : "count", "executionMode" : "async"}' -H "Content-Type: application/json" -X POST http://13.76.166.53:8111/api/v1/query
{"status":"success","query":"( AND IN_SiamSquareShopping:ABC (AND LOW_BALANCE:ABC ACTIVE:ABC))","processingTime(ms)":5,"result":29}
I got the above results. 29 of my 100 active customers are currently having a low balance and in the geofence I am tracking. If you want to find out who they are, you may change the resultType to "reducedValues" and you would get their anonymous identifiers. Isn't that insanely powerful? In reality, the results would be tens of millions and the API support pagination to fetch all the results. And you can quickly launch an engagement (campaigns) in Sift for them in a matter of few minutes by using Sift Portal. Kuldeep shares a very exciting blog on how you can use Sift's beautiful analytical visualisers to manipulate these audiences and launch engagements. For those DQL lovers, here is the BNF grammar version of the Sift's Audience Management DQL
GraphQuery -> SetOperationQuery GraphQuery -> ApplyOperationQuery GraphQuery -> FilterOperationQuery GraphQuery -> SchemaQuery GraphQuery -> EdgeQuery SetOperationQuery -> ( SET_OPERATION GraphQuery GraphQuery ) ApplyOperationQuery -> ( APPLY_OPERATION EDGE GraphQuery ) FilterOperationQuery -> ( FILTER_OPERATION pattern GraphQuery ) SchemaQuery -> \w+?\w+ EdgeQuery -> \w+:\w+ EDGE  -> \w+: SET_OPERATION -> AND|OR|DIFFERENCE APPLY_OPERATION -> APPLY FILTER_OPERATION -> FILTER pattern -> \w+=\w+
In the next part, how we an use Sift's DQL to query for audience with relationships. It excites me every time I talk about it.

Comments

Post a Comment

Popular posts from this blog

A Date - Part 1

“Good Morning. How is everything? Did you manage to have an early breakfast date with yourself?” I was on the bed in a hotel room in Jakarta when that SMS woke me up. I only managed to get a very few hours of sleep that night. Even in that, my mind went churning non-stop on some meaningless, unconnected things. I was literally aware of all that turmoil in the half sleep, twisting and tossing through it. Back to the SMS. All the SMS I get usually are with a single consistent purpose - my colleagues trying to find my whereabouts. Note - none of them is a hot babe. (This is altogether a different topic - how come no hot chick in this whole damn world manages to become an IT architect!!) Oh, for a change, I do occasionally get some irritating marketing campaigns. So, it is not a wonder that this message sprang me awake like a Maasai Warrior. (Maasai are an African tribe famous for their ability to rise from the deepest sleep to a state of total combat readiness in a matter of seconds)...

Reboot the Universe - Part 1

Let me warn you first. This is going to be a damn serious reading. I have access to a certain secret button. The button that reboots the Universe! At this point, if you are my wife or anyone else that doesn’t take me seriously, you better quit reading. If you are chatty Praful, naughty Babul, genius-explored Thakkali, genius-unexplored Hendry I, genius-recently-explored Henry A, Hilarious K, Cy - a big Welcome! If you are a stranger here, you have beaten an odd of 2,317,563,890,001 : 1 in hitting an intelligent blog. Congrats! If you are me, you are really sexy, sensational, cool, etc etc and btw finish writing this fast. It all started when she settled in the Starbucks at 11AM, Latitude14° 35’ N and Longitude 120° 57E. It was a working day. Hush, don’t ask. She is random. Dazzlingly random. And she needs to be in random places to get into her genius moments. But that crumby place was so noisy that she had to plead God’s help. Then she suddenly realized (or over heard from the noise...

The Great Indian Divide - Part 1

Have you combed an Indian metro recently? What struck you as the blinding flash of the obvious? The shining India? AND also in a few meters, a POOR nation caught in the slip stream of capitalism? An economic divide ? AND hence a cultural divide ? In short, did you notice the Class Conflict ? It’s the same class conflict or “alienation” that Karl Marx saw in the late 1800s. This has been a cause of my reluctance to move back to India. Because common sense and history tells that such class conflicts will result in social unrest and ugly revolutions as small as “mangalore bar attacks” (google it) to something as catastrophic as the French Revolution. What is the cause for this great Indian divide? The answer is “Knowledge Economy” and the “Knowledge Workers”. Let me dummify these terms first. Think of a line (like the poverty line) called the machine line . You are above the machine line if machines are your slaves. For example, a biotechnologist or a banker who uses a computer to get hi...