My Anaconda DOES Want Some
I consider myself to be a pretty technical TPM, but it mostly manifests as an understanding of the way systems are arranged architecturally, being able to meaningfully comment on engineering designs and crunching numbers via SQL or some other equivalent data analysis language.
You can do all of those things through a browser though, which means it has been a long time since I've needed to struggle through setting up a local development environment in order to do something.
I bet you can guess where this is going.
The Hell Is This?
I spend a solid chunk of my time analysing data and then publishing the results, usually to facilitate a decision of some sort or to show progress towards a goal.
Atlassian is actually really good in this respect, because rather than business intelligence being an afterthought, there are entire teams dedicated to building and maintaining the tools necessary to empower people to do whatever they need to do.
Mostly this means that when I want to do some analysis, all I really need to do is start up my browser, go to the appropriate internal website and start creating queries and dashboards, the wonderous lake of Atlassian data at my fingertips.
With one exception.
Atlassian has a huge store of metadata relating to things like which shard a user is currently located on, whether or not the user requires data residency and all other sorts of really interesting nuggets of information, but it's not integrated with the rest of the data lake.
I've smacked headfirst into this fact a few times in the past and each time I've chosen to work around it, either by finding another source of data or politely asking an expert to get me the data I need with their magical powers.
When it happened again recently, I decided to do something about it.
You know, teach a man to fish and all that.
Snakeskin
At a really high-level what I needed to do was get some information on regional distribution of users and the shards they occupy across a few of the services that back one of our products.
There is a nice browser interface to the metadata store, but it doesn't really work all that well with large data sets (i.e. millions of results) and there is a significant learning curve to using it for aggregations or other meaningful analysis.
The good news is that the team that owns the metadata store also provides a set of scripts that can be run from the command line to extract commonly used data, or to provide a baseline for doing your own thing.
Note that I said scripts, not tools, because it's actually a repository of Python files that requires a fully functional development environment to use.
Starting from nothing, that meant:
- Installing a package management tool (Brew)
- Installing relevant packages (Git, pyenv, Atlassian specific toolkit, vscode)
- Configuring those packages (i.e. SSH shenanigans for Git)
- Terminal configuration (i.e. .zprofile and PATH shenanigans)
I did run into two specific challenges that are worth pointing out though.
The first was learning exactly what sort of terminal I had on my Atlassian provided laptop and also learning how to configure it, because there are apparently a billion different variations, each with different instructions. I broke the terminal a few times trying to make it remember what I had installed and configured for next time, which was fun.
Thank god I could still access and edit the .zprofile file through Finder, or I would have been screwed.
The second was the pain-filled existence that was virtual Python environments. I tried to do things the right way, to setup a virtual environment specific to the local directory that contained the scripts repository I wanted to leverage, but I could just not get it working for the life of me.
It was a few hours of the following message, which I'm pretty sure will haunt my dreams decades from now:
Failed to activate virtualenv.
Perhaps pyenv-virtualenv has not been loaded into your shell properly. Please restart current shell and try again.
Anyway, I gave up and just installed Python globally because at least then I could move on with my life.
There Are Snakes This Big?
I've never actually had to use Python professionally up until this point, but I didn't need to understand it from top to tail to accomplish my goal, and the basic tenets are pretty much the same, so with the environment set up and working, the rest was pretty pain-free.
I mean, I assume I dodged a bullet by making sure that the global version of Python that I installed was the right one for the repository. If I ever need to use Python for anything else that is probably going to come back and bite me, at which point I'll need to figure out the virtual environment stuff.
The only other thing left to do was install some dependencies, which was easy enough to do via pip because the maintainers of the repository were nice enough to include a requirements file.
An hour or two of cargo cult programming later and I had a script of my very own that would extract just the data that I needed and dump it into a CSV file, so that I could work with it to do the analysis that I needed to do.
It was a few million rows, which is a little bit chunkier than Google Sheets can handle, but the Atlassian data lake allows for the import of CSV data, so I just threw it in there.
At which point I was back in familiar territory, slightly scarred from my ordeal, but wiser all the same.
This Skin Is Old, The Snake Is Bigger Now
And that's it. That was my technical adventure, which in the grand scheme of things wasn't so much an expedition into the deep dark depths of the jungle and more a breezy walk into a park near my house.
It's good to know that I haven't entirely lost all of those skills that served me well in my early career, but let's be honest, I didn't really do anything overly difficult.
Still, it was nice to get my hands dirty again. There is definitely something to be said for being able to immediately see the results of your hard work, as opposed to never really knowing that what you're doing is making a difference.
A thought still lingers though.
Was this the most valuable thing I could have done in that couple of hours?
I've always got more things to do than time to do them in and I'm never sure if I'm making the right decision when I choose to focus on something like this.
Perhaps there was some sort of figurative snake that I should have been hunting.
Or a literal one.