Identifying cannabis use in social media is harder than it may seem at first. Python can be used to search the text of posts, but it does not code the images. Hashtag and word matches are searchable and countable, but they will miss visual portrayals and cues.
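As a rough illustration of the word and hashtag matching described above, here is a minimal Python sketch. The term list, the function name `count_terms`, and the sample posts are all hypothetical and only there for illustration; a real lexicon would need far more street names.

```python
import re
from collections import Counter

# Hypothetical starter lexicon (assumption): extend it with the street names
# that matter for your own sample.
CANNABIS_TERMS = ["cannabis", "pot", "weed"]

# Word boundaries keep "pot" from matching "potato" or "spot".
# The optional "#" lets the same pattern catch plain words and hashtags.
TERM_PATTERN = re.compile(
    r"#?\b(" + "|".join(re.escape(term) for term in CANNABIS_TERMS) + r")\b",
    re.IGNORECASE,
)


def count_terms(posts):
    """Count how often each lexicon term appears across a list of post texts."""
    counts = Counter()
    for text in posts:
        for match in TERM_PATTERN.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    sample_posts = [
        "Smoking #weed tonight",
        "Potato soup in a big pot",  # "pot" here is a false positive
        "WEED and cannabis",
    ]
    print(count_terms(sample_posts))
```

Note that the second sample post is counted as a "pot" match even though it is about cooking, and compound hashtags such as "#weedlife" are not matched at all. Text matching like this is a screening step and does not replace human coding.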
Content analysis can be done to catch visual portrayals and cues, but in my experience it is often difficult to distinguish cannabis smoking from smoking cigarettes or self-rolled tobacco cigarettes. Unless the tobacco or cannabis is clearly shown or there is a verbal identifier, the coder may not be sure what is in a pipe or bong (tobacco vs. cannabis). One approach to getting acceptable reliability (e.g., Krippendorff's alpha) on a difficult measure such as this is to code for smoking not otherwise specified (cannabis or tobacco), with a separate code for smoking identified visually or verbally as tobacco and another for smoking identified as cannabis. Use at least two coders, preferably three, and train them on at least 10% of the total sample. Also, there are many street names for drugs (cannabis, pot, weed, etc.), so it is important to train the coders on these.
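To show how reliability on this three-code scheme could be checked, here is a minimal sketch of Krippendorff's alpha for nominal data, written from the standard coincidence-matrix definition rather than taken from any particular package. The function name, the code labels (`smoking_nos`, `tobacco`, `cannabis`), and the example ratings are assumptions made up for illustration, not real data.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    units: one list per coded post, holding each coder's code
    (None where a coder did not code that post).
    """
    coincidences = Counter()
    for ratings in units:
        values = [v for v in ratings if v is not None]
        m = len(values)
        if m < 2:
            continue  # a post coded by fewer than two coders carries no information
        # Each ordered pair of codes from different coders adds 1 / (m - 1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("Need at least one post coded by two or more coders.")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # only one code was ever used; alpha is undefined
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical ratings from three coders on four posts (illustration only).
    ratings = [
        ["cannabis", "cannabis", "cannabis"],
        ["tobacco", "tobacco", "smoking_nos"],
        ["smoking_nos", "smoking_nos", None],
        ["tobacco", "tobacco", "tobacco"],
    ]
    print(round(krippendorff_alpha_nominal(ratings), 3))
```

An alpha of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. Computing it separately for the "not otherwise specified" code and the substance-specific codes can show which part of the scheme is causing trouble.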