Road to Docker Part 1

Let me be perfectly honest: there will probably not be a part 2. Easy as Docker is, there's still a bit of a barrier to entry. I'm a software dev; I don't really want to configure VMs, but, well, devops. I don't need to know how to do everything, but I want some understanding of how these pieces fit together. So I took one of our projects and spent some time yesterday containerizing it. I'm going to go through how I went about it.

Step 1. The Search for Tutorials

I spent a while looking through tutorials, trying to find a good one. Spoiler: I didn't. I was mostly able to make use of this one: it's a very long tutorial that I was able to pick and choose some things out of. I'm really sad there isn't a better tutorial. (That 5-minute Docker tutorial is cute, but ultimately worthless.)

Step 2. Installing Docker on a Mac

I wanted to create Docker containers on my laptop. It didn't take long to discover this isn't really something you can do directly on a Mac. I wanted to avoid creating a Vagrant/VirtualBox VM just to create Docker images, so I searched for what people typically do. The answer is they use something called boot2docker, which, as it turns out, is just a Linux VirtualBox VM to run Docker on. Go figure.

Step 3. Creating a Container — Dockerfile

Upon reading how to create a container manually, it seemed silly not to just create a Dockerfile. A Dockerfile is just a config file that allows Docker to create the image. Having mild familiarity with Vagrant configs and Puppet, I see the Dockerfile as a way to keep a simple, replicable record of the Docker image being created. And simple it is. The format is more readable than its cousin, the Vagrantfile. It was a simple thing to create the Dockerfile for my Django app. It uses only a few simple keywords, the most useful being RUN.
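For flavor, a minimal Dockerfile for a Django app might look something like this. This is a sketch, not the actual file from our project; the base image, paths, and port are all assumptions.

```dockerfile
# Hypothetical Dockerfile for a Django app -- base image, paths, and
# port are assumptions, not the project's real config.
FROM ubuntu:14.04

# Install python and pip, then the app's dependencies.
RUN apt-get update && apt-get install -y python python-pip
ADD . /app
WORKDIR /app
RUN pip install -r requirements.txt

# Expose the dev server port and start the app.
EXPOSE 8080
CMD ["python", "manage.py", "runserver", "0.0.0.0:8080"]
```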

Step 4. Creating the Image — build

This was just a simple docker build command:

docker build -t docker_image_label .

This creates a new image, labelled with the -t param.

Step 4.5. Finding and Removing Images

docker images

This will list all images (on the boot2docker VM), any of which you can remove with

docker rmi <name>

(Note that docker rmi removes images; docker rm removes containers.)

Step 5. Running the Image — run

This was just a simple docker run command. There are several ways to run it; I went with the following:

docker run -d -p 8080:8080 --name docker_instance_name docker_image_label

This just uses -d to run it as a daemon and -p to forward the ports. It must be noted that "forwarding the ports" only gets you as far as the boot2docker VirtualBox VM, not localhost. To see the result, you should make use of boot2docker's built-in IP reporting:

boot2docker ip
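Putting steps 4 and 5 together, the whole loop looks roughly like this (a sketch; the image and instance names are the placeholders from above, and your VM's IP will vary):

```shell
# Build the image and run a container from it.
docker build -t docker_image_label .
docker run -d -p 8080:8080 --name docker_instance_name docker_image_label

# The port is forwarded to the boot2docker VM, not localhost,
# so ask boot2docker where the VM lives...
boot2docker ip

# ...and hit the app at that address (example IP shown).
curl http://192.168.59.103:8080/
```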

Step 5.5. Stopping Docker, Killing Docker

docker ps

This will give a listing of the running docker instances, so you can

docker kill <name>
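The heading says "stopping" as well as "killing" because there are two commands: docker stop is the graceful one (SIGTERM, then SIGKILL after a timeout), while docker kill is immediate. A sketch, reusing the placeholder name from earlier:

```shell
docker ps                          # list running containers
docker stop docker_instance_name   # graceful shutdown
docker kill docker_instance_name   # immediate SIGKILL
docker rm docker_instance_name     # remove the stopped container
```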

Step 6. Docker Hub

One thing that was neato to discover is that knowledge of github / git was translatable to docker. They have designed the usage of Docker Hub around “push” and “pull” concepts. It’s just a matter of pushing images up so they can be pulled down later. What I wasn’t able to find was how to actually send the Dockerfile along with the image. It’s not strictly necessary, but I’ve seen it on other projects, so I know it’s possible…
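The push/pull flow, sketched with a placeholder Docker Hub username and repo name (not our actual ones):

```shell
# Tag the local image under your Docker Hub namespace, then push it.
docker tag docker_image_label myusername/myapp
docker push myusername/myapp

# Later, anyone can pull it back down and run it.
docker pull myusername/myapp
```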


Getting the container up and running and distributable was no big deal, but this was not enough. Next steps would be to grab a mysql docker image from the hub and have the two communicate with each other. Right now I’m really just using the container as a VM, which isn’t really the point of docker.

The Silo Mentality: It Can Be A Good Thing If You Don’t Do Your Own Thing

Silos: What Are They Good For?

The Silo Mentality, as defined in the Business Dictionary, is not new nor confined to business. Silos are prevalent in all aspects of Academia, and having them around is generally considered a Bad Thing. Take a certain University in Cambridge, for example. It's a big place. A huge place. With many different Schools, consisting of many different administrative and academic departments, each consisting of many different groups. Not to mention libraries and museums and other centers of learning. It's a cornucopia of conflicting visions, motivations, and group mentalities, all of which are somehow supposed to work together toward a common mission of education and research (which are, in themselves, sometimes at odds with each other). Having each entity operating by itself, looking inward rather than outward, seems a recipe for disaster.

A disaster it certainly can be, and there are often great expenditures of time, money, and people to break down silo barriers. For those of us in Educational Technology, the proposed wrecking balls are often getting everyone onto the same Learning Management System (LMS), the same calendar, the same email, the same network, the same software and hardware and computers and . . . the list goes on and on. For administrative functions, such as calendaring and email, this approach makes sense. It's incredibly difficult to operate with different calendaring and email systems. Even using different versions of office software or operating systems can make a mess of things (I'm looking at you, Microsoft Office for Mac and Windows!).

So I can understand why academic organizations such as University IT may want to centralize, well, everything they possibly can, including environments for software development. I’ve heard many of my co-workers and leadership say things like, “We should all be using Java” or “We should all be using Git” or “We all need to be using Agile Software Development.” At the surface level, such decisions make sense. If everyone uses the same tools and possesses a similar skill set, then you have an excellent pool of resources that can look out for each other. If one person gets sick or gets hit by that proverbial bus, someone else can jump in. You can shift people on and off projects more efficiently. Getting up to speed is faster. Yes, there’s lots of good stuff there.

Such a centralized approach may be good for a small business or even a small school, administratively. But in Higher Education, even at small schools, the breaking of barriers argument quickly implodes when it comes to academic needs. This is because Higher Education is, by definition, about academic freedom, and it’s this freedom, as ironic as it may seem, that builds the silos in the first place. The building of silos, in fact, is inevitable and necessary.

As an example, take software development for pedagogical tools, an area my group is intimately familiar with. Unlike many of the software development groups within our parent organization, we do not focus on the development of enterprise-wide applications that cater to a common need across multiple organizations; we develop software on a much smaller scale over a much more diverse range of needs and, hence, technologies. Additionally, we are responsible for helping faculty and students stand up their own software, which usually involves a different technology stack than our own. So unlike some of our counterparts, we cannot be confined to a single language such as Java or Python. We cannot be confined to single deployment infrastructure. Such confinement inhibits the academic freedom of students and staff. We are, by necessity, different from other groups. Separated. A silo.

This is not a Bad Thing. Our educational technology group needs to operate differently from enterprise groups–such as, say, the Registrar–just as the enterprise groups need to operate differently from us. Maintaining these silos is a Good Thing. It's isolating the silos from each other that causes problems. Since our group is building courseware, we are often interested in accessing student data. It would make zero sense for us to have our own set of student data when we should be utilizing the information managed by the Registrar. If we did our Own Thing, that would be isolation.

It’s isolation of silos that causes people to start waving around those wrecking balls of centralization, but removing the silos all together is missing the point of academic freedom. Rather, bridges need to be built between the silos, conduits of communication and best practices that allow the silos to act as a cohesive team. So how does one go about doing this? We’ll explore some possibilities in  an upcoming post. Maybe. Assuming I write it.


Topics in Version Control

This last month I put together a presentation on version control. As far as I'm aware, everyone I work with uses some form of version control. My purpose with the presentation was manifold: I had a lot of things that had been percolating that I wanted to think about and express, and this was an excellent opportunity for that.

Git vs SVN

One thing I know is a sore point for some developers is the idea that git is taking over. There are a lot of people espousing the idea that git is better than svn in every way. This has created a culture of conflict, because there are a lot of developers making good use of svn, most of whom have heard the primary arguments in favor of git and found them not compelling enough to motivate a switch.

This is okay.

There is no war on centralized version control, and if there is, the only people pushing it are dumbasses. (This coming from someone who was one such dumbass a couple years ago.)

It’s anti-agile to prescribe something like that. Each team works differently and needs to determine for themselves if the benefits outweigh the costs.

Continuous Delivery

Recently I read a book called Continuous Delivery. It is a comprehensive, blowhard description of how devops "needs" to be. It covers testing strategies and branching strategies. While a lopsided view of the world, it offered potential solutions that I have found fascinating.

Trunk Based Development vs Feature Branching

This offers not just a workflow that allows for potentially faster delivery, but also a realistic way to use SVN: an alternative to the branching hoopla around distributed systems.
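Trunk-based development usually leans on feature toggles: unfinished work gets committed to trunk but stays switched off until it's ready, so nobody needs long-lived branches. A minimal sketch in Python (the flag names here are made up for illustration):

```python
# Hypothetical feature toggles: unfinished code is merged to trunk
# but stays dark in production until its flag flips.
FEATURE_FLAGS = {
    "new_checkout_flow": False,  # committed to trunk, not yet released
    "search_v2": True,           # finished and switched on
}

def is_enabled(name, flags=FEATURE_FLAGS):
    """Return True when the named feature toggle is switched on."""
    return flags.get(name, False)

def checkout():
    # Both code paths live on trunk; the toggle picks which one runs.
    if is_enabled("new_checkout_flow"):
        return "new flow"
    return "old flow"

print(checkout())  # "old flow" until the toggle is flipped
```

Flipping a flag in config is how half-finished features ship continuously without ever blocking a release.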


It also gave me an opportunity to talk about DevOps in a reasonably safe environment. It’s a political word at the moment and while I don’t want to get involved in that, I do want to talk about the DevOps movement.

This movement is about working together. Communicating on a level that hasn’t classically been done. It’s there because a lot of the problems we face are due to the fact that in our workflow, we develop a product and throw it over a virtual wall to operations to “handle”. They don’t understand what we need done and we don’t understand what they need done. It’s a matter of sharing knowledge and searching for a deeper understanding of what we’re doing. It’s hard because we don’t want to spread ourselves thin, but in order to deliver a quality product, you have to be more than just a cog in the machine. You don’t have to know exactly “how” to do everything, but you have to have a basic understanding of “what” is happening.

But I digress. I was able to tie this into version control by talking about how DevOps is also concerned with automation: creating scripts to do all parts of the deployments. These scripts, this automation, need to be in version control as well. They "can" be in the same repository, or they can be in a separate repository. But managing dependencies, flipping all the right switches, needs to be maintained in much the same way code is controlled. Because it is code.

Database Version Control

And of course, you can’t talk about Version Control without talking about database version control… but this one requires its own article.

Cloud Delivery Platform — Thought Model

I was asked to take a look at a thought model for a Cloud Delivery Platform. I’m not going to post that graphic, because it wasn’t ready for prime time. It was confused and I’m not really sure what it was trying to show. So I created a bit of a diagram showing what I thought was important. (Just click on it, it doesn’t seem to fit right, and the next size down is too small to read.)


Halfway through I realized I'm not terribly good at making these things. I'll get better. But there are a few things I wanted to draw attention to in this graphic.

The Roles

The most important thing to me was to highlight the crossover of roles at each step. There are few steps in the process that should be performed by a single role. Devops is about communication.

Local Development

Coding, building, testing, packaging are all things that should be done at the same time. You don’t code 1000 lines then build. You build with every small change. In the same way, you should have tests written with the small increments. And that gets extended to “packaging”.


I'm probably missing a very important word in my vocabulary here, but I'm going to go easy on myself, it's late. What I mean here is the process of setting up an image, or a deployable product: the configuration that will collect external dependencies and run tests when it gets to the deployed environment. This should all be done locally first. It should all be put into version control, separate from the application code. And this needs to be done in collaboration between the developer, who knows the application, and operations (or, as they've been renamed in my organization, "DevOps Engineers"), who know how to package things appropriately.

Local Concurrence vs Cloud Linearity

Why isn’t spell check underlining “linearity”. That can’t be real. As I mentioned before, the things done locally are all done at the same time (build, test, package). That’s all just development. When it gets to the cloud, it’s a done product. Nothing should be further developed there. Nothing should be getting reworked. Everything needs to come entirely from Version Control and no manual finagling. (obviously?) So the cloud portion is linear. Everything happens in order. If it fails at any point, it’s sent back to local dev and fixed there.
Maybe this is all so obvious, it doesn’t need to be said. Maybe I’d feel better if it were just said anyway.

Jira vs Github :: Agile vs Open Source

A few years ago, my team got into Open Source. Specifically, we started writing all of our apps on github (as opposed to our SVN). We wanted to do this because we wanted to invite scrutiny. We never expected people to look at our stuff, we just felt that by putting it out in the open, we’d want to do better internally.

We went whole-hog. Organized stories with issues, organized sprints with milestones. It was pretty hot stuff. And it was all in the open. Potentially, someone could come by, see what we’re doing and offer to take a story from the backlog of issues. That’s open source. ish. We have a lot more we can do. A lot of growth to do.

Then we started using Jira. The board system within Jira Agile was excellent. Allowed for better organization, reporting!, and visual representations of work. It’s great. It’s Agile. But it also replaced what we were doing with Github issues.

We essentially replaced Open Source in favor of Agile. Our organization is great, we’re keeping track of things fantastically, but we’re no longer open. We don’t have transparency on what we’re working on anymore. People can’t potentially help. Our code is out there, but we’re not inviting. Our process is no longer out there.

So what’s our solution? We don’t have one yet. But what /can/ we do?

We need to put our vision statement out there. We need to put our plans out there. We need to expose what it is we’re doing. We also need to stay agile, keep our tools intact, keep our reporting.

This means we probably need to be duplicating efforts. Open Source and Agile are both hard work and organization. That they can’t line up and be the same effort is not a blocker, just an “oh well”.


It's goal setting season and everyone is reviewing the organization's FY15 goals. Understanding the organizational goals is important because we all want to be sure to be in line with the direction of the organization.

#4 on the top 10 is "Establish an enterprise IT architecture". This is a phrase that I've never fully understood, so I've been asking people to define "enterprise", and I get all sorts of answers: "big", "important", "java / oracle / peoplesoft", "something that 'can't' fail", "finance", "HR", "business". (Important was my favorite.)

People use this word constantly. There are even groups in my organization that are called the "enterprise groups". I was told recently that there's "less dysfunction in the enterprise groups", to which I had to ask, "which groups are those?" (Answer: they're the ones that have significant business value.) But the fact that I can't find someone to give me a coherent definition is a warning sign that it has become a meaningless buzzword.

But business is the right answer. Enterprise seems to be anything that encompasses the entire organization (or a significant portion of it) or something that has significant business value.

Anyway, that’s neither here nor there. What I think someone was trying to say with “enterprise IT architecture” is a standardization of technology across the organization — often a pipe dream. And I think this goal tends to draw extra attention to the groups that are already considered the “enterprise” groups.

But I'm raising a deeper question here. Shouldn't everyone be considered enterprise, if enterprise is just a fancy word for business? If you're not providing significant business value to your organization, then maybe you need to rethink what it is you're doing. If you're not considered enterprise, then someone doesn't think you're providing business value, affecting enough people, or having enough impact.

So when "establishing an enterprise IT architecture", I'm taking that to mean: communicate with other groups, figure out what they're doing, and share what you're doing. Meeting that goal is potentially impossible in a large, silo'd organization, but working toward it is ambitious and requires a lot of time. Working toward that goal is just implementing good collaborative practices. Simple, but not easy.

The Code Kata

The idea of code kata is not a new one. This has been floating around for a while, but I didn’t make the connection of its importance until recently.

The idea is this: doing your work is not the same as practicing what you do. The analogy I like the most is the chess analogy. A grand master cannot be a grand master by simply playing in tournaments. In fact, just playing in tournaments would make a pretty poor player. Studying tactics, reading, discussing (with other human beings that have their own unique perspectives) outside of the game allows personal growth and an understanding of the game that is unattainable by simply playing it.

Maybe that analogy speaks to me because I understand games and being great at them vs just playing them.

I spent some time misunderstanding what a code kata needed to be. For the longest time I had understood it to be technical growth via solving algorithms or learning new technologies. These could be katas, but what makes katas doable is two things:

  • Simple.
  • Interesting.

Keeping them simple allows you to make time for them. Google the FizzBuzz test. Even something that simple can keep your mind thinking like a developer.
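For the curious, the whole FizzBuzz kata fits in a dozen lines of Python:

```python
def fizzbuzz(n):
    """Classic kata: multiples of 3 -> Fizz, of 5 -> Buzz, of both -> FizzBuzz."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

print(" ".join(fizzbuzz(i) for i in range(1, 16)))
# 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz
```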

Keeping them interesting allows you to care about them, and caring about them will ensure they actually get done. If you're not interested in learning Node but you set yourself up to learn it as your kata, then, one, that's a big task, but more importantly, if you're not interested, you're setting yourself up to fail. This is a great way to find what it is you're interested in and focus on it more.

Myself, I seem to currently be interested in best practices and the social aspects of software engineering. So I've decided to increase the blog posts I do. From the beginning of these, I've been focused on doing at least one a month, and that's been a way for me to think through some of the things I've worked on in the last few years. By writing them out, it's been a real way for me to cement the things I've been doing in my head and analyze them a little more carefully. This, I believe, will help me be better at my job.

Yes, I blog for me, not for any kind of readership. That’s helpful since no one reads these :)


Big on Big Data

Of major interest in Educational Technology is Big Data, which is a rather all-encompassing term for the vast mountains of information that our computers are collecting and how we humans might take advantage of it. The 2014 NMC Horizon Report for Higher Education highlighted this interest, noting that the "Rise of Data-Driven Learning and Assessment" and "Learning Analytics", both of which revolve around the analysis of Big Data, are technological trends that may have profound impact on education as we know it. A primary goal of several MOOCs, including edX, HarvardX, and MITx, is to research the data they collect from thousands of world-wide participants. And here, at Harvard, a main objective of the new Teaching and Learning Technologies program is to research all the data our new Learning Management System and associated tools will provide. Finally, our own little group, the Academic Technology Development team, is starting to research how we can best analyze the data our tools collect.

The main drive behind this interest in Big Data–at least, in Educational Technology–is the belief that through analyzing all of this information, we can create a better teaching and learning experience for instructors and students. The idea goes something like this: Alice, a student at Wonderland University, takes a course on existentialism from the Caterpillar. As Alice completes her assessments, the results are stored online, and through careful analysis of these results, the Caterpillar is able to deduce which concepts Alice is having difficulty understanding, and in which concepts she is excelling. The Caterpillar can then tailor Alice's future work to her needs, thus creating a better learning experience for her.

That’s the simplified version of the tale; many other permutations exist. For example, the Caterpillar may recognize that another student, Tweedledee, excels in one area that his brother, Tweedledum, does not, and decides to pair the two up so that Tweedledee can help his sibling. Or, perhaps, the Caterpillar discovers that his students grasp the concepts of mushrooms better when he discusses the concepts of fungi first, and adjusts his future syllabi accordingly.

Regardless of the scenarios, the promises of Big Data rely on the firm belief that something meaningful exists somewhere in all of that information, and that if we extract that meaningful something, then we can do something meaningful with it. Sound a bit ethereal? It's not, as commercial companies such as Amazon, Netflix, and Google have uncannily demonstrated with their search and recommendation engines. And in regards to applying Big Data to education, there's a large field of research going on, not to mention a growing field of application and practice.

Grow Developers

At OSCON, I attended a lot of great sessions. One that had an impressive impact on me was Grow Developers. It was not what I expected. I only read half the description and it seemed like it was right up my alley of “developing developers”. That is, helping developers get better.

However, as soon as the talk started, I was shocked to discover it was actually a women-in-technology talk. At first I was disappointed; then I looked for the door and realized I had sat in the front and would have to walk through a room full of women… That would be awkward. So I got comfortable and braced myself for a talk I didn't think I wanted to hear. I mean, yes, there aren't enough women in technology, and that sucks. It's probably unfair because of cultural perceptions, but I was at OSCON to learn about technology and awesomenesses and such, not gender equality.

But it wasn’t so bad. In fact, the things covered in the talk were not altogether gender equality specific, and could be abstracted into exactly what I wanted to hear about in the first place.

Better Job Descriptions

One super important point that was made was that job descriptions need to be better. Too often jobs are posted and people are clearly qualified. Their resumes perfectly match the job description, but they never get a call back. This often happens because what people are looking for specifically isn't included in the job description. But the fact that someone is weeded out when they appear to be a perfect fit on paper tends to lead them to believe there is something else keeping them from the job: something personal, something cultural. The solution to this is better job descriptions and better communication from HR. This is something I've felt 100 times over.

Stop Interrogations

Ever been on an interview where you had to justify yourself or answer absurd CS questions to a room full of people? This is bad enough as a guy justifying himself to 3-10 other guys; it's a lot worse when it's a woman who has to justify herself to a room full of guys. It's something we've all had to endure in this profession: being disrespected right off the bat. If you respect people and want them to come in and impress, a step toward that is making them feel comfortable. Make them feel like you want them to succeed.

wtfs per minute


I wasn't familiar with this word until the show Silicon Valley. Since then, I thought it was a joke of a word, representing a culture that didn't really exist, like the frat-guy programmer. But apparently I misunderstood? It's more about the way men communicate. We are sometimes hostile to each other. An example can be made of code reviews: it might be common for a lot of people to say "your code sucks, wtf". In fact, that reminds me of the "WTFs per minute" code quality comic courtesy of Jeff Atwood. This is not really a welcoming culture for anyone. Without an "in" attitude, people (not just women) aren't going to feel welcome, and it's going to distance people who aren't already in this culture.

Solvable Problems

These are solvable problems with our culture. Work with HR, don’t gang up on people, be a little more sensitive to other people. Be better.

Integrating Jasmine with Travis CI

One of the things I’ve been wanting to automate with our Harmony Lab project is the javascript test suite so that it runs via Travis CI. I tried once before, but hit a wall and abandoned the effort. I recently had the opportunity to work on this as part of a professional development day in ATG, which is an excellent practice that I think all departments should embrace, but that’s a topic for another day. If you’re not familiar with Travis, it’s a free continuous integration service that provides hooks for github so that whenever you push to a branch, it can run some tests and return pass/fail (among other things). Getting this to work with a suite of python unit tests is easy enough according to the docs, but incorporating javascript tests is less straightforward.
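For reference, a Travis setup for a Django project with headless JS tests might look roughly like the .travis.yml below. This is a hedged sketch, not the actual Harmony Lab configuration; the script names and file paths are assumptions.

```yaml
# Hypothetical .travis.yml -- script names and paths are assumptions,
# not the real Harmony Lab config.
language: python
python:
  - "2.7"
install:
  - pip install -r requirements.txt
script:
  - python manage.py test                      # Django/python unit tests
  - phantomjs run-jasmine.js spec-runner.html  # headless Jasmine specs
```

Travis reports pass/fail back to GitHub based on the exit codes of the script commands.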

Harmony Lab JS tests use the Jasmine testing framework, and all of the unit tests are created as RequireJS modules. This is nice because each unit test, which I'll call a spec file, independently defines the dependencies it needs and then loads them asynchronously (if you're not using RequireJS, I highly recommend it!). Running the Harmony Lab tests locally is a simple matter of pointing your browser to http://localhost:8000/jasmine. This makes a request to Django, which traverses the spec directory on the file system, finds all spec files, and then returns a response that executes all of the specs and reports the results via Jasmine's HTML reporter. But for headless testing, we don't want to be running a Django server unless it's absolutely necessary. It would be nice if we could just execute some javascript in a static HTML page.

It turns out, we can! The result of integrating Jasmine with PhantomJS and Travis CI is Harmony Lab Pull Request #39. You can check out the PR for all the nitty-gritty details. The main stumbling block was getting RequireJS to play nicely with PhantomJS and getting the specs to load properly. The PhantomJS javascript code, that is, the javascript that controls the browser, was the simplest part, since it only needed to listen for the final pass/fail message from Jasmine via console.log and then propagate that to Travis.