Productivity and analysis tools for online marketing

🎉 Enhanced: Text Analysis for Online Marketers
Part 1: SEMrush article on two text analysis techniques with examples.
Part 2: Kaggle notebook expanding on the concepts with new functions as well as enhancements to existing functions.

advertools: create, scale, and manage online campaigns

A digital marketer is a data scientist.
Your job is to manage, manipulate, visualize, communicate, understand, and make decisions based on data.

You might be doing basic stuff, like copying and pasting text in spreadsheets, you might be running large-scale automated platforms with sophisticated algorithms, or you might be somewhere in between. In any case, your job is all about working with data.

As a data scientist you don’t spend most of your time producing cool visualizations or finding great insights. The majority of your time is spent wrangling with URLs, figuring out how to stitch together two tables, hoping that the dates won’t break without you knowing, or trying to generate the next 124,538 keywords for an upcoming campaign by the end of the week!
advertools is a Python package that can hopefully make that part of your job a little easier.

I have a tutorial on DataCamp that demonstrates a real-life example of how to use Python for creating a Search Engine Marketing campaign. There is also a project to practice those skills in an agency / case study setting.

I also have an interactive tool based on this package, where you can generate keyword combinations easily.

Main Uses:

  • Generate keywords: starting from a list of products, and a list of words that might make sense together, you can generate a full table of many possible combinations and permutations of relevant keywords for that product. The output is a ready-to-upload table to get you started with keywords.
>>> import advertools as adv
>>> adv.kw_generate(products=['toyota'],
...                 words=['buy', 'price'],
...                 match_types=['Exact']).head()
           Campaign Ad Group           Keyword Criterion Type
    0  SEM_Campaign   toyota        toyota buy          Exact
    1  SEM_Campaign   toyota      toyota price          Exact
    2  SEM_Campaign   toyota        buy toyota          Exact
    3  SEM_Campaign   toyota      price toyota          Exact
    4  SEM_Campaign   toyota  toyota buy price          Exact
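To make the idea of "combinations and permutations" concrete, here is a minimal sketch using only the standard library. It is not the package's implementation: `keyword_phrases` is a hypothetical helper, and the rule it applies (every ordering of the product with up to `max_len - 1` extra words) is an assumption made for illustration.

```python
from itertools import permutations


def keyword_phrases(product, words, max_len=3):
    """Hypothetical helper: every ordering of the product with extra words.

    Phrases contain between 2 and max_len terms and always include the product.
    """
    terms = [product] + list(words)
    phrases = []
    for size in range(2, max_len + 1):
        for combo in permutations(terms, size):
            if product in combo:  # keep only phrases that mention the product
                phrases.append(' '.join(combo))
    return phrases


phrases = keyword_phrases('toyota', ['buy', 'price'])
print(len(phrases))   # 10
print(phrases[:5])
# ['toyota buy', 'toyota price', 'buy toyota', 'price toyota', 'toyota buy price']
```

Even two words produce ten phrases for one product, which is why the number of keywords grows so quickly once you add more products, words, and match types.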
  • Create ads: There are two main ways to create text ads: from scratch (bottom-up), or top-down (given a set of product names).
  1. From scratch: This is the traditional way of writing ads. You have a template text, and you want to insert the product name dynamically in a certain location. You also want to make sure you are within the character limits. For more details, I have a tutorial on how to create multiple text ads from scratch.
>>> adv.ad_create(template='Let\'s count {}',
...               replacements=['one', 'two', 'three'],
...               fallback='one',  # in case the total length is greater than max_len
...               max_len=20)
["Let's count one", "Let's count two", "Let's count three"]

>>> adv.ad_create('My favorite car is {}', ['Toyota', 'BMW', 'Mercedes', 'Lamborghini'], 'great', 28)
['My favorite car is Toyota', 'My favorite car is BMW', 'My favorite car is Mercedes',
'My favorite car is great'] # 'Lamborghini' was too long, and so was replaced by 'great'
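The length check behind these examples can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the package's code: `fill_template` is a hypothetical name, and raising an error when even the fallback does not fit is a choice made for this sketch.

```python
def fill_template(template, replacements, fallback, max_len=20):
    """Hypothetical stand-in: insert each replacement, or the fallback if too long."""
    if len(template.format(fallback)) > max_len:
        # Design choice of this sketch: refuse a fallback that cannot fit either.
        raise ValueError('template + fallback is longer than max_len')
    ads = []
    for word in replacements:
        candidate = template.format(word)
        if len(candidate) > max_len:
            candidate = template.format(fallback)
        ads.append(candidate)
    return ads


cars = ['Toyota', 'BMW', 'Mercedes', 'Lamborghini']
print(fill_template('My favorite car is {}', cars, 'great', 28))
# ['My favorite car is Toyota', 'My favorite car is BMW',
#  'My favorite car is Mercedes', 'My favorite car is great']
```

'My favorite car is Lamborghini' is 30 characters long, two more than the limit of 28, so the fallback is used for that ad only.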
  2. Top-down approach: Sometimes you need to start with a given list of product names, which you can easily split into the relevant ad slots, taking into consideration the length restrictions imposed by the ad platform. Imagine having the following list of products, and you want to split each into slots of 30, 30, and 80 characters (based on the AdWords template):
>>> products = [
...     'Samsung Galaxy S8+ Dual Sim 64GB 4G LTE Orchid Gray',
...     'Samsung Galaxy J1 Ace Dual Sim 4GB 3G Wifi White',
...     'Samsung Galaxy Note 8 Dual SIM 64GB 6GB RAM 4G LTE Midnight Black',
...     'Samsung Galaxy Note 8 Dual SIM 64GB 6GB RAM 4G LTE Orchid Grey'
... ]
>>> [adv.ad_from_string(p) for p in products]
[['Samsung Galaxy S8+ Dual Sim', '64gb 4g Lte Orchid Gray', '', '', '', ''],
 ['Samsung Galaxy J1 Ace Dual Sim', '4gb 3g Wifi White', '', '', '', ''],
 ['Samsung Galaxy Note 8 Dual Sim', '64gb 6gb Ram 4g Lte Midnight', 'Black', '', '', ''],
 ['Samsung Galaxy Note 8 Dual Sim', '64gb 6gb Ram 4g Lte Orchid', 'Grey', '', '', '']]
Each ad is split into the respective slots, making sure they contain complete words, and that each slot has at most the number of characters allowed.
This can save time when you have thousands of products to create ads for.
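The splitting itself can be pictured as a greedy fill: keep adding whole words to the current slot until the next word would exceed its limit, then move on to the next slot. The sketch below illustrates that idea under stated assumptions. `split_into_slots` is a hypothetical helper, it uses only the three slot lengths mentioned above plus one extra slot for leftovers, and it does not change capitalization the way the package's output does.

```python
def split_into_slots(text, slots=(30, 30, 80)):
    """Hypothetical helper: greedily place whole words into length-limited slots.

    Returns one string per slot, plus a final string with any leftover words.
    """
    words = text.split()
    result = []
    for limit in slots:
        current = ''
        while words:
            candidate = (current + ' ' + words[0]).strip()
            if len(candidate) > limit:
                break  # the next word does not fit; move on to the next slot
            current = candidate
            words.pop(0)
        result.append(current)
    result.append(' '.join(words))  # whatever did not fit in any slot
    return result


product = 'Samsung Galaxy Note 8 Dual SIM 64GB 6GB RAM 4G LTE Midnight Black'
print(split_into_slots(product))
# ['Samsung Galaxy Note 8 Dual SIM', '64GB 6GB RAM 4G LTE Midnight', 'Black', '']
```

The first slot ends at exactly 30 characters, and 'Black' moves to the third slot because adding it to the second would make it 34 characters long.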
  • Analyze word frequency: Calculate the absolute and weighted frequency of words in a collection of documents to uncover hidden trends in the data. This is basically answering the question, ‘What did we write about vs. what was actually read?’ Here is a tutorial on DataCamp on measuring absolute vs weighted frequency of words.
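As a rough illustration of the difference between the two measures, the sketch below counts each word once per occurrence (absolute frequency) and also adds up a number attached to each document, such as page views (weighted frequency). It is a simplified stand-in rather than the package's function: `word_frequency_sketch` is a hypothetical name, splitting on whitespace is an assumption, and the titles and view counts are made up.

```python
from collections import Counter


def word_frequency_sketch(texts, numbers):
    """Hypothetical helper: absolute and weighted word counts for a list of texts."""
    abs_freq = Counter()  # how often each word was written
    wtd_freq = Counter()  # how much "weight" (e.g. page views) each word collected
    for text, number in zip(texts, numbers):
        for word in text.lower().split():
            abs_freq[word] += 1
            wtd_freq[word] += number
    return abs_freq, wtd_freq


# Made-up example data: article titles and their page views.
titles = ['python tips', 'python tricks', 'excel tips']
views = [100, 300, 5000]

abs_freq, wtd_freq = word_frequency_sketch(titles, views)
print(abs_freq.most_common(2))  # [('python', 2), ('tips', 2)]
print(wtd_freq.most_common(2))  # [('tips', 5100), ('excel', 5000)]
```

In this toy data 'python' is written about as often as 'tips', but 'excel' collects far more views, which is the kind of gap between what was written and what was read that the weighted measure is meant to surface.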
  • Extract important elements from social media posts: Get the more informative
    elements of social media posts (hashtags, mentions, emoji). You also get some basic statistics about them. Check out a more detailed tutorial on Kaggle, on how to extract entities from social media posts using these functions.
>>> posts = ['i like #blue', 'i like #green and #blue', 'i like all']
>>> hashtag_summary = adv.extract_hashtags(posts)
>>> hashtag_summary.keys()
dict_keys(['hashtags', 'hashtags_flat', 'hashtag_counts', 'hashtag_freq',
           'top_hashtags', 'overview'])

What are the hashtags?
>>> hashtag_summary['hashtags']
[['#blue'], ['#green', '#blue'], []]

>>> hashtag_summary['top_hashtags']
[('#blue', 2), ('#green', 1)]

How many were there per post?
>>> hashtag_summary['hashtag_counts']
[1, 2, 0]

And you can do the same for mentions and emoji (with the textual name of each emoji).
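For readers curious about what such a summary involves, here is a minimal sketch that reproduces a few of the keys shown above with the standard library. It is only an approximation: `hashtag_summary_sketch` is a hypothetical name, and the pattern `#\w+` is a simplified assumption, not the regular expression the package uses.

```python
import re
from collections import Counter

# Simplified pattern (an assumption of this sketch): '#' followed by word characters.
HASHTAG = re.compile(r'#\w+')


def hashtag_summary_sketch(posts):
    """Hypothetical helper: per-post hashtags, their counts, and the most frequent ones."""
    hashtags = [HASHTAG.findall(post) for post in posts]
    flat = [tag for tags in hashtags for tag in tags]
    return {
        'hashtags': hashtags,
        'hashtags_flat': flat,
        'hashtag_counts': [len(tags) for tags in hashtags],
        'top_hashtags': Counter(flat).most_common(),
    }


posts = ['i like #blue', 'i like #green and #blue', 'i like all']
summary = hashtag_summary_sketch(posts)
print(summary['hashtags'])        # [['#blue'], ['#green', '#blue'], []]
print(summary['hashtag_counts'])  # [1, 2, 0]
print(summary['top_hashtags'])    # [('#blue', 2), ('#green', 1)]
```

Swapping the pattern for something like `@\w+` would give a comparable sketch for mentions.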

The package is still under heavy development, so expect a lot of changes.
Feedback and suggestions are more than welcome.


pip install advertools


Function names mostly start with the object you are working on:

kw_: for keywords-related functions
ad_: for ad-related functions
url_: URL tracking and generation
extract_: for extracting entities from social media posts (mentions, hashtags, emoji, etc.)
twitter: a module for querying the Twitter API and getting results in a pandas DataFrame
serp_: get search engine results pages in a DataFrame, currently available: Google and YouTube
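As an illustration of what "URL tracking and generation" means in practice, the sketch below appends UTM parameters to a landing page URL using only the standard library. It does not show the package's url_ functions or their signatures: `utm_url` and its parameters are hypothetical, and the example URL and campaign values are made up.

```python
from urllib.parse import urlencode


def utm_url(url, source, medium, campaign):
    """Hypothetical helper: append UTM tracking parameters to a URL."""
    params = {
        'utm_source': source,
        'utm_medium': medium,
        'utm_campaign': campaign,
    }
    # Use '&' if the URL already has a query string, otherwise start one with '?'.
    separator = '&' if '?' in url else '?'
    return url + separator + urlencode(params)


print(utm_url('https://www.example.com/shoes', 'google', 'cpc', 'summer sale'))
# https://www.example.com/shoes?utm_source=google&utm_medium=cpc&utm_campaign=summer+sale
```

`urlencode` takes care of escaping, so spaces and special characters in campaign names do not break the URL.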


0.7.4
  • Changed
    • serp_goog with expanded pagemap and metadata
  • Fixed
    • serp_goog errors, some parameters not appearing in result df
0.7.3 (2019-04-17)
  • Added
    • New function extract_exclamations very similar to extract_questions
    • New function extract_urls, also counts top domains and top TLDs
    • New keys to extract_emoji; top_emoji_categories & top_emoji_sub_categories
    • Groups and sub-groups to emoji db
0.7.2 (2019-03-29)
  • Changed
    • Emoji regex updated
    • Simpler extraction of Spanish questions
0.7.1 (2019-03-26)
  • Fixed
    • Missing __init__ imports.
0.7.0 (2019-03-26)
  • Added
    • New extract_ functions:

      • Generic extract used by all others, and takes arbitrary regex to extract text.
      • extract_questions to get question mark statistics, as well as the text of questions asked.
      • extract_currency shows text that has currency symbols in it, as well as surrounding text.
      • extract_intense_words gets statistics about, and extracts, words with any character repeated three or more times, indicating an intense feeling (+ve or -ve).
    • New function word_tokenize:

      • Used by word_frequency to get tokens of 1,2,3-word phrases (or more).
      • Split a list of text into tokens of a specified number of words each.
    • New stop-words from the spaCy package:

      current: Arabic, Azerbaijani, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Italian, Kazakh, Nepali, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish.

      new: Bengali, Catalan, Chinese, Croatian, Hebrew, Hindi, Indonesian, Irish, Japanese, Persian, Polish, Sinhala, Tagalog, Tamil, Tatar, Telugu, Thai, Ukrainian, Urdu, Vietnamese

  • Changed
    • word_frequency takes new parameters:

      regex: defaults to words, but can be changed to anything, for example ‘\S+’ to split on whitespace and keep punctuation.

      sep: no longer used as an option; the above regex can be used instead.

      num_list: now optional, and defaults to counts of 1 each if not provided. Useful for counting abs_freq only when the data is not available.

      phrase_len: the number of words in each split token. Defaults to 1 and can be set to 2 or higher. This helps in analyzing phrases as opposed to words.

    • Parameters supplied to serp_goog appear at the beginning of the result df

    • serp_youtube now contains nextPageToken to make paginating requests easier

0.6.0 (2019-02-11)
  • New function
    • extract_words to extract an arbitrary set of words
  • Minor updates
    • ad_from_string slots argument reflects new text ad lengths
    • hashtag regex improved
0.5.3 (2019-01-31)
  • Fix minor bugs
    • Handle Twitter search queries with 0 results in final request
0.5.2 (2018-12-01)
  • Fix minor bugs
    • Properly handle requests for >50 items (serp_youtube)
    • Rewrite test for _dict_product
    • Fix issue with string printing error msg
0.5.1 (2018-11-06)
  • Fix minor bugs
    • _dict_product implemented with lists
    • Missing keys in some YouTube responses
0.5.0 (2018-11-04)
  • New function serp_youtube
    • Query YouTube API for videos, channels, or playlists
    • Multiple queries (product of parameters) in one function call
    • Response looping and merging handled, one DataFrame
  • serp_goog returns Google’s original error messages
  • twitter responses with entities, get the entities extracted, each in a separate column
0.4.1 (2018-10-13)
  • New function serp_goog (based on Google CSE)
    • Query Google search and get the result in a DataFrame
    • Make multiple queries / requests in one function call
    • All responses merged in one DataFrame
  • twitter.get_place_trends results are ranked by town and country
0.4.0 (2018-10-08)
  • New Twitter module based on twython
    • Wraps 20+ functions for getting Twitter API data
    • Gets data in a pandas DataFrame
    • Handles looping over requests higher than the defaults
  • Tested on Python 3.7
0.3.0 (2018-08-14)
  • Search engine marketing cheat sheet.
  • New set of extract_ functions with summary stats for each:
    • extract_hashtags
    • extract_mentions
    • extract_emoji
  • Tests and bug fixes
0.2.0 (2018-07-06)
  • New set of kw_<match-type> functions.
  • Full testing and coverage.
0.1.0 (2018-07-02)
  • First release on PyPI.
  • Functions available:
    • ad_create: create a text ad by placing words in placeholders
    • ad_from_string: split a long string into shorter strings that fit into
      given slots
    • kw_generate: generate keywords from lists of products and words
    • url_utm_ga: generate a UTM-tagged URL for Google Analytics tracking
    • word_frequency: measure the absolute and weighted frequency of words in
      a collection of documents


Files for advertools, version 0.7.4
Filename                               Size      File type  Python version
advertools-0.7.4-py2.py3-none-any.whl  185.3 kB  Wheel      py2.py3
advertools-0.7.4.tar.gz                217.4 kB  Source     None
