Udta Punjab Controversy: Another Example Of Streisand Effect

This post will be a break from the usual and just examine at how the Indian Censor Board has brought even more attention to an offbeat film [by Bollywood standards], one that addresses the drug menace in a local province called Punjab.

After the Indian censor board suggested 89 cuts in the movie, including removing reference to the province name, it's kinda clear that politics plays a bigger part there - especially when elections are coming up in the same province.

What the censor board didn't expect what the movie industry to virtually unite behind the movie - calling the cuts archaic and completely opposing freedom of expression via cinematic means.

Here's how it looks on Google Trends:

16th Apr: Movie trailer got released and it generated some interest [I'd watch this movie!].

28th May: The first round of movie screening process was completed a week ago and the makers deny claims that the movie will be banned due to the topic.

9th June: Creativity is killed as the censor board actually suggested 89 cuts in a movie [making it kinda pointless to even release it]. This actually opened up a can of worms with celebrities, newspapers going after the rather ludicrous idea of incorporating these cuts.

The interest line between 16th Apr and 28th May shows that interest in the movie was reasonably picking up with nothing particularly shocking about the content. Once rumours started flying about a possible ban [around 28th May], everyone's head turned to this movie to know what's it all about.

Related searches in Google Trends already suggests that the soundtrack is already getting picked up among popular searches.


On Twitter, the #udtapunjab has taken all forms with the movie characters getting featured, the main actors, censor board and of course, social commentary behind the reality of situation and how critical is it.

Right above, you can see the same data for the #udtapunjab really growing in the last one week.

If/when the movie gets released, it's now bound to a success, thanks to some innovative ideas from the censor board. The matter went to Bombay High Court and it seems, the same censor board has now approved the movie with an Adult rating [sure...] and 13 cuts. 

Here's some of the most popular RT's coming from celebrities in favour of the movie.


As of now, the trailer has been watch 12 M times on YouTube already. 73 K likes / 5.9 K dislikes and 6.5 K comments. Just scrolling through comments shows people hating on the censor board and the ruling political party in the province among other issues.

Have a look. Anything worth censoring in the first place...? :)






How To Get Blog Post Category From Squarespace To Google Analytics: DOM Method In GTM / Solution# 2

In the previous post on this topic, I covered how to scrape the DOM element value for the blog category and pass it as a variable and then into GA.

Here's the link to Solution #1 for getting blog post category from SquareSpace to Google Analytics.

A really quick recap is that we create a DOM element variable, define a trigger that includes this variable and then fire an event where action can be the DOM element {{Blog Category}} and label {{page path}}.

This is one method. The other method can be to use custom dimensions. 

Why would one use custom dimensions?

Here are some good examples of custom dimension usage from a Quora thread.

Even though knowing the blog category is fine, we need to first create a CD in Google Analytics and then decide the scope [hit, session, user]. This is important for GA to understand how this dimension operates.

e.g. If we chose hit for blog category, we would get hit metrics such as pageviews

If we choose session, then all the pages seen in a single session would get classified with the page category of the last page.

If we chose user, we would classify the user as a reader of a particular blog category type [so everytime this user shows up, the same category would be assigned].

A more detailed example is explained here. https://support.google.com/analytics/answer/2709828?hl=en#scope

In the blog category page, it would a hit metric.

Once we define a CD in GA, we then go to GTM for the tag.

In the event method, we created a event based trigger. In CD page. we create a pageview trigger instead where the CD matches the blog category.

and you're done. Once in GA, you can now combine it with other dimensions to create hit level reports and analyze the data more easily.

Debugging GTM: Data Layer Variable sending undefined value

Recently ran into this issue where a dataLayer was defined and showing in debug mode but GA would keep randomly sending out undefined as category/action/label values in GA - which was quite frustrating for the developer because the right details were indeed getting passed in DOM.

Something was going wrong between DOM and GA. Here's what I had going before seeking help in Google Tag Manager Product forum.


  1. Define dataLayer event to pick up category/action/label [action and label parts of event were dynamic and captured from page interaction.
  2. Create these dataLayer as unique category/action/variables for GTM to pick up the values.
  3. Created a click based trigger where when the page interaction took place + category/action/label matched RegEx .*  , an event fired.
  4. GA event created with action and label fields being the dataLayer values while the trigger being in step#3.

As dumb as it sounds now, step#3 was the problem where category/action/label were created as an event and not just a click. Changing the trigger in step#3 from click based to Custom Event where the event = the name of the dataLayer event name [not category/action/label] plus changing the matches RegEx .* to does not contain 'undefined' makes sure that we have values in place.

Here's the link to Google Tag Manager Product Forum where I got help in fixing this.



How to get blog post category from Squarespace to Google Analytics: DOM method in GTM / Solution# 1

Getting Squarespace to work with Google Tag Manager is kind of a pain. Instead of using the async GA code into code injection setting, I opted for some search and found this handy JQuery script from Chris Eisenbraun that works amazingly well with GTM: Link to GTM script for SquareSpace.

Ok, so now that I have GTM up and running on this blog, the next question I had was related to which blog categories work well compared to others. After reaching out to SquareSpace support, was informed that dataLayer method isn't available in SS yet....hhhmmm.

This presents a challenge on how to get this done. In this post, I've used a video tip from MeasureSchool to capture the blog category by using the DOM Element variable in GTM.

Main steps in implementing this:

1. Finding out the name of the DOM element that has the blog post category.

Let's use a previous blog post from this blog as an example and scroll to the end where the post category [google analytics] is mentioned.


Right click on the category [google analytics] and choose 'Inspect Element'. Hover over the class and you'll see the blog category highlighted.

Go to the Console tab in dev mode and type 'document.querySelectorAll("span.categories")' Enter. Drop down and you should see the category.

2. Now that we know the DOM element that captures the blog category, we go back to GTM and create a variable that captures this particular DOM Element.

3. Create a GTM Trigger that we can use in our events.

Since we need to capture this as soon as a page is loaded and not on click, we will keep it as a Page View trigger with the Regex = .* . This would capture all the categories.

4. Create GTM event based on this trigger.

The Event Category is labelled as Blog Category while the Action and Label is dynamic.

Event Action = {{Blog Category}} captures the value in DOM element variable while Event Label = {{Page Path}} captures the blog post link except the analyticslog.com part.

If a blog post does not have category assigned to it, Event Action will show as null. This is because Event Action is a required field.

First, we test it in Debug mode to be sure this is working. Open up an article and with DOM Ready left tab, go to Variables. This will now show the blog post category. Go back into GA and we now have the values in as events.

null is where a blog post did not have category assigned to it.

If you have more than one blog post category, it will capture as category1, category 2.

The event method can help where Custom Dimensions have been exhausted.  Will do a separate blog post on custom dimension method.

Here's the reference video I used from MeasureSchool in implementing this.

Update: Here's the Custom Dimension method for the same requirement.



Deeper Integration of Search Console and Google Analytics.

This was long overdue but it's happening. It's a little surprising but still common to see websites that don't have their Search Console data connected to GA.

By connecting SC to GA, the former is able to pass some of the query data to GA [taking care of some of the 'why can't I see my keywords in organic reports?' issue].

This announcement gives an even greater reason to connect it immediately as the feature is being rolled out.

While the queries report will continue to be limited, the new integration of SC will be at a landing page level. This means that users can now use the ABO/ABC approach to get organic search metrics, right from search impressions down to conversion.

This helps in several ways:

If a landing page does well in conversion rates but has low search CTR's, it could be investigated for meta descriptions copy, keyword ranking and relevance. Increasing the CTR's for such traffic would help with the CR on the website.

Another application could be reducing the bounce rates for high click-through organic landing pages. Is the web page delivering on the search? Can the user understand the page within 5 sec? Are navigation buttons/CTAs clearly laid out? 

Is this feature really new and was not possible without the new integration? No...

GA landing pages report would give you metrics starting Sessions.

Search Console > Search Traffic > Search Analytics reports

Change the radio button to landing pages instead of queries and get Impressions + Clicks metrics from here. Join these to the GA Landing pages report [under Behavior] via VLOOKUP functions to add impressions and Clicks and you have the same report ready...landing page name would be the primary key for both these tables. 

For a multi product site, they could analyze by product categories [example landing page www .buyfromus,com/catrgory1/productA.

Just by doing sone quick data labeling in Excel via SEARCH or MID function, you'd know the performance by categories plus where you need to channel more efforts.

 you'd have to go the same for country / devices data to get such metrics.

....so, if this can be done within the existing framework with a bit of work,is it a really deep integration...?

and that's why there was a big of Twitter banter between Simo Ahava [from Reaktor, GTM expert] and Daniel Waisberg [GA Evangelist].



Excel Tricks For Digital Analysts - Part 2

In part 1 of this post, we focused mainly on text extraction [and labelling it] to identify contest posts/questions/audience tags etc. Here's a link to part 1 of the post: Excel tricks for digital analysts.  

1. Combining tables where some of the columns are different.

If you downloaded post level insights [in csv] for two different time periods, there is a possibility that the number of columns might not match, especially engagement metrics for video [if the post type wasn't used in the period]. A csv Facebook post insights download could have approximately 90 columns. Therefore, finding the missing/extra columns between two csv files would be rather impractical. 

How would you find this in Excel?

Power Query plug-in can help solve this problem by creating a third pivot table [combining the two csv data insights] as long as there is a primary key. In this case, this could be a post ID.

Let's say our likes and comments are in the first download and the shares for the same posts are in the second file. Both files have a post ID column where the posts are the primary key [1 -10]. This is how the files would look.

Open Query tab and add data from table.

Now click on New Source > File from the right hand corner and add the second file.

Now if you appended the first table to the second one, you would see something like this:

Once you pivot this again, you'd have your likes, comments and shares by the unique post ID.

How to install Power Query. 

2. Finding number of words in a post.

While number of characters in a post can be easily determined using the LEN formula, finding the number of words requires a bit of trick.

How would we do this in Excel?


Breakdown of this formula:

First part:

TRIM function removes extra spaces in a cell. This can be helpful in case of web downloads.

LEN function would then count the number of characters in the cell.

Second part:

SUBSTITUTE function removes blank spaces and then counts the number of character (after trimming the text).

The difference in LEN between the first and second part of the formula tells us how many blank spaces we have. Adding +1 to tricks Excel into giving us the number of words. To understand this a bit more. The number of words in a sentence will always be the number of spaces between words + 1.

  • "Sample social post" - Three words in sentence, has two spaces in between words.
  • "Another sample social post" - Four words in sentence, has three spaces in between words.

Add +1 to both these examples and you have the number of words in this sentence.

3. Extracting the date from a Twitter Analytics .csv file

Here's an easy one...Twitter's Tweet data is in .csv format with the Time column having this format [yyyy-mm-dd hh:mm +0000] the hh:mm +0000 signifies the GMT time, not your local time.

If you tried extracting any aspect of the date such as MONTH function, you'd get a #VALUE error....which is why you need a column to convert the text into numbers.

How would we do this in Excel?

Using the DATE function, we can convert the standard format into dates that we can use. The syntax for DATE is (year, month, day)


The date patterns are thankfully consistent i.e. the month is not shown as m but mm...This makes it much easier to extract.

We can nest LEFT and MID functions within the DATE function to get this.

LEFT(D2,4)*1 checks D2 and extracts the first four characters as a string. Multiplying by 1 converts it to a number.

Similarly, MID(D2,6,2) extracts the month while MID(D2,9,2) extracts the day.

Once you're able to extract all these numbers, the DATE function understands these as inputs to determine the date and voila, you now have your dates ready.


More tips on the way. If you have questions that you'd like answered or easier ways to implement these ideas, please let me know via comments, on Twitter: @akhantweets  or via the contact page here.

Link to previous tips in this series:

Excel Tricks For Digital Analysts - Part 1

Thanks for reading.



Excel tricks for digital analysts - Part 1

As a digital analyst, I probably end up using Excel just every few hours. 

I think there are a lot of tricks/formulas that analysts can use to better explain the story. Besides your reporting platform, having the ability to wrangle data in Excel for ad-hoc analyses definitely gives you flexibility to shape your story. I'll use examples that analysts will find themselves running into...hopefully, this post will help.

1. Finding out which posts were contest based [and therefore, had a higher engagement rate] 

Contest posts usually perform very well and when you're evaluating content, you should mark them as outliers or include it but point it out to others. Knowing the impact of contest posts for always-on content helps you plan better.

How would you find this out in Excel?

Your contest post message copy could contain a hashtag or not. But one thing's given, it'll probably have the words such as 'contest', 'win' in the copy to ring a bell with the audience and get 'em participating.

Here's a screenshot of dummy data from Facebook Insights post level data in csv.

Within the text, the formula checks for the word "contest" in D5 using the SEARCH function...If it finds the word, it tells you the ordinal position from where the word starts. Once you have that, you ask Excel if this cell is now a number? As you have an ordinal position, it's a TRUE....and if it's a TRUE, you can now use an IF function to label the column as "Contest Post"...if FALSE, it goes as "Non contest post"...Put this into Pivot tables and see how contests help bump up engagements.

2. Extracting Twitter profile names from Tweets you sent out, adding the URL to their profile as text and hyperlinking it.

Say, you had a really good month and engaged with a host of Twitter folks. Now that you've broken the ice with 'em, your client has asked for a list of users you've interacted with...hmmm.

How would you do this in Excel?

Once you download your sent tweets from Twitter, your pattern is that the tweets sent out to users have a space between the user's handle name and the actual copy...You do have two challenges:

  1. The challenge here is that each user will have a different length[characters] for username...so your pattern is the space between user profile and the tweet " ".
  2. Your brand's tweets are going to be in some rows in between and we need to first know if a tweet was sent to an individual user or posted as part of the content calendar?


Ok, so there's quite a bit going on here...let's break it down:

We only want to extract the user profile links from posts that are sent out to other users...brand tweets posted from the content calendar need not be included here...

We use an IF function to check if the first character of the text is "@"...this would mean that that we sent a tweet to an individual user...

If yes, We use the MID function to start extracting the text from the second position,

We continue extracting text in the MID function till we find " " [space between user handle and copy]...we then subtract 2 characters from it to be exactly till the end of user profile.

We then append a text string to it with "www.twitter.com/".

and finally, we hyperlink the user profile link using the HYPERLINK function....

IF the first character is not "@", then it's part of the FALSE condition and we just label it as "brand post".

fun, right?

3. Finding out engagement rates by audience tags being used in Facebook posts.

With the new preferred audiences tags in Facebook, post data download now has a new column 'audience tags'. What if you wanted to know which tags are helping posts with engagement rates?

How would you do this in Excel?

You could easily use a Text to Columns feature under Data tab to separate out the tags into columns like this...

Now you have the number of tags used and the engagement rates for posts...

You could look at relationships to find out the ideal number of audience tags....

You could create a new column to check if generic/branded/competitor audience tags (using the function in the first tip) perform better in terms of engagement rate.

4. Finding out which posts contain a question

So you already know the performance on link/photo/video/status posts...what if you could add a layer in your pivot table to segment the data for question posts...check if they get more comments from your community...

So you used the steps in the first tip to find '?' in the text...but this time, column D wrongly says that text "we've read" is a question post when it isn't...

This is because '?' is a wildcard operator and Excel cannot understand that you're trying to find cells that contain '?'

How would you find this in Excel?

Adding the tilda key in the SEARCH function tells Excel that you're NOT using the question mark as a wildcard operator...

Now when you run the same nested function from the first tip, column E correctly returns "general" for the last text post.

That's all the time I have now...will keep adding posts as part of 'Tricks for digital analysts' series.

If you're looking for help with Excel, some cool places to learn:

  • Mr. Excel forum
  • Excel subreddit
  • Excel Is Fun YT Channel
  • Chandoo.org



Changes To Facebook's 20% Text Rule

Just saw this tweet 24 hr ago from @mattjuniper and it immediately got my attention. The 20% text rule has often resulted in ads getting disapproved even when the actual text (not grids) is less than 20% of the image.


Following the tweet, it took me to WeRSocial's blog Following the tweet, it took to WeRSocial's blog http://wersm.com/has-facebook-finally-removed-the-20-rule/ 

Without rehashing the linked blog content, what's really interesting is that Facebook will not prevent ads with more 20% text from going live. Instead, content would be categorized into:

  • Image text: OK – you ad image contains little or no text
  • Image text: Low – your ad’s reach may be slightly limited
  • Image text: Medium – your ad’s reach may be limited
  • Image text: High – you may not reach your audience

So, much text should go into the ad and is the first quadrant [Image text: OK] = 20%? Probably, but Facebook would not want to explicitly mention the percentage as it would mean all content creators rushing towards 20% (or higher/lower) when there's so much context that will determine a photo post's final reach.

From an media perspective, this would definitely impact the Cost Per Reach / Cost Per Engagement metrics as brands would have to choose visual content that has just enough text to retain user attention. 

This change seems to have been applied to UK/Ireland only [for now] http://adigitalboom.com/update-facebook-removes-20-text-overlay-rule-for-ads-uk-ireland/




[New feature in Facebook] How Preferred Audience targeting can help small pages

Facebook's newest feature just got rolled out to all pages. Understanding that all content is not equal, Facebook had previously provided Interest based targeting (in ads only) allowing users to search for audiences based on keywords. While this was good, the catch was that it could be only be used with Facebook ads and not organically.

Audience Optimization has three main features:

  • Preferred audiences
  • Audience restrictions [same as post/newsfeed targeting]
  • Audience Insights


Let's talk about the first one, Preferred audiences. The main difference here is that Facebook will prioritize post delivery to those users whose interests match the post's preferred audience rules, without limiting the overall post reach.

As organic reach continues to diminish, smaller pages (businesses) contemplating moving to Facebook ads can now get data about how to route their organic content (previously unavailable) before testing the waters with Facebook ads. The post insights report from the page should probably follow in terms of adding a new column containing interest names used in post targeting (it already has languages and countries targeted). 

As pages begin to understand which audiences are reacting to their organic content, they'd be more willing to jump in with Facebook ads - knowing whom to target, getting more value of their limited budgets.

Here's a screenshot of trying to use 'Baseball' as an Interest while creating a post. What's different so far is that the potential audience size is not available here. This is something that was for Interest targeting in FB Ads - giving advertisers a good idea about whether a particular interest segment is too small to warrant an ad set (or if multiple interests need to be combined to create a more feasible audience size).

Edit: Just checked this feature again. The audience size shows up after you add the interest...not while you're deciding which interest to add.


facebook preferred audience targeting

As the objective of Preferred Audience targeting is to help direct posts to the right audiences, audience sizes would probably be shown - bringing more clarity into where to direct posts.

The more (organic) audience insights Facebook shares with page owners, the more likely they are to start advertising there. 

I'm yet to test posts using this targeting but it'll be cool to check back in a few weeks on what data can we find in the post itself / data downloads taken from Facebook Insights. Thread to be updated.