At Xebia, we implement numerous internal initiatives. One such initiative involves developing a toolset for assessing a customer’s situation. This toolset (working title: Truffleswine) allows us to retrieve relevant data from systems quickly, which in turn helps us ask the right questions sooner and clarify business cases for improvement using actual data. For a more detailed description, visit here.
However, the purpose of this blog is not just to describe the toolset, but to share our development approach and some valuable lessons I (re)learned along the way.
Our development approach
Here is an overview of how we introduce new capabilities into the tool:
We usually start with a concrete customer need, such as a particular piece of data or a visualization. We often begin in a Jupyter notebook, focusing less on reusability and testability and more on getting the job done quickly for that single use case, to discover whether it is useful at all.
Once we are confident the feature is valuable, we consider how to integrate it into the system. Rather than drawing up a design first, we use pair programming and test-driven development (TDD), focusing on the desired API first, until we are satisfied. This typically concludes with the removal of the original notebook code.
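As a rough sketch of what “desired API first” looks like in practice – every name here (`extract_commit_stats`, the commit dicts) is invented for illustration and is not the tool’s real code – the test pins down the API before the implementation exists:

```python
# A sketch of API-first TDD; all names are made up for illustration.

# Step 1: write the test that describes the API we wish we had.
def test_extract_commit_stats_counts_per_author():
    commits = [
        {"author": "alice", "sha": "a1"},
        {"author": "bob", "sha": "b1"},
        {"author": "alice", "sha": "a2"},
    ]
    assert extract_commit_stats(commits) == {"alice": 2, "bob": 1}


# Step 2: make it pass with the simplest implementation that works.
def extract_commit_stats(commits):
    """Count commits per author from a list of commit dicts."""
    counts = {}
    for commit in commits:
        counts[commit["author"]] = counts.get(commit["author"], 0) + 1
    return counts
```

Writing the test first keeps the conversation in a pairing session about how the feature should be used, not how it happens to be built.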
We try to integrate often, so features still in discovery are also merged into our main branch. It’s worth mentioning that we do work with pull requests, but we have chosen to make peer reviews optional because:
- We have stringent automated checks with tests, coding style, dependency checks, and coverage.
- We each tend to review changes of interest asynchronously, outside the pull request. None of us develops the tool full-time, so contributors can merge code without having to wait.
- When possible, we pair program (review in real time).
- The tool is an innovation project. Changes are easy to make and mistakes are easily forgiven. We don’t need 100% uptime at this point.
Postponing decisions is great
- Making a tool into a product (transitioning from ‘a simple script’ to something anyone could potentially use easily) is a significant investment. However, having limited time and people forced us to focus on the most important things first.
- Instead of getting stuck analyzing what the design could or should be, I’ve learned to ask: “Do I really need to make that decision now?” – The answer is often “no”. This allows me to focus on what’s important – if the design decision matters, it will pop back up later.
Overengineering slows down
We have overcomplicated things in a few instances and later removed the complexity. The most significant examples:
- We didn’t need a React front-end: We initially tried to replace our Jupyter notebooks with a React front-end connected to an OpenAPI spec on the server. We quickly realized that this required more work to achieve the same flexibility and extensibility, at the expense of innovation power and development capacity. Moreover, we didn’t have any “non-expert users”.
- We didn’t need a database: An earlier version used an ORM, anticipating a point where things would be placed in a database. We started with SQLite, intending to switch to something else later. However, as we run the tool locally, plain CSV files (individually up to 1GB) seemed good enough and were more convenient. We ditched the database and ORM, although we might revisit this decision later on.
- We didn’t need type-safety everywhere: Partly because of the choice to use an ORM, our data classes were robust and type-safe. However, as we shifted focus towards analyzing (and transforming) the data, we began to favor more generic pandas DataFrames over our own data classes.
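To illustrate the simplification (the file contents and column names below are invented for this sketch and are not the tool’s real schema), a dataset is now just a CSV that pandas loads straight into a generic DataFrame:

```python
import io

import pandas as pd

# Stand-in for one of the local CSV files; columns invented for this sketch.
csv_data = io.StringIO(
    "repo,commits,open_prs\n"
    "truffleswine,120,4\n"
    "website,87,2\n"
)

# No ORM, no SQLite: pandas reads the CSV into a plain DataFrame...
df = pd.read_csv(csv_data)

# ...and transformations that used to go through typed data classes
# become ordinary DataFrame operations.
busiest_repo = df.sort_values("commits", ascending=False).iloc[0]["repo"]
```

Dropping the typed layer costs some compile-time safety, but for analysis-heavy code the generic DataFrame operations turned out to be where all the real work happens anyway.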
Small projects are great for personal development and experimentation
- These types of inner-sourced projects are a safe place for people to collaborate – we’re all hard-working people, and we don’t blame anyone when things go wrong or go slowly.
- We get to try new development tools that we can’t always use on our customer assignments. For example, GitHub Copilot is incredibly helpful when writing code. When I don’t know the exact syntax or API (for libraries like ‘plotly’ or ‘pandas’), I start typing (often as a comment) and see what code Copilot suggests. It often produces the right code, and if not, it helps me know what to search for in the documentation.
- I continue to find great value in TDD-ing code and code that has been TDD-ed.
- Nothing else provides the same focus, instant feedback, save-points, and the ability to avoid making all decisions at once.
- When changes are easy to make, not all decisions feel like a commitment. This enables me to try things out and learn, which is my main motivation for working on the project.
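To give a feel for that comment-driven workflow (the data and column names are invented for this sketch), I type a comment describing what I want and let the assistant propose the pandas code below it:

```python
import pandas as pd

# Made-up data for illustration; in practice this would come from
# the CSV files the tool retrieves.
df = pd.DataFrame({
    "team": ["payments", "payments", "search"],
    "lead_time_days": [3.0, 5.0, 8.0],
})

# compute the median lead time per team, sorted from slowest to fastest
# (a comment like the one above is the "prompt"; the code below is the
# kind of completion Copilot tends to suggest)
median_lead_time = (
    df.groupby("team")["lead_time_days"]
    .median()
    .sort_values(ascending=False)
)
```

Even when the suggestion isn’t quite right, it usually names the method (`groupby`, `median`, `sort_values`) I need to look up in the documentation.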
It’s sometimes easy to forget how satisfying and fun writing code can be, particularly in complex organisations where things easily get stuck on internal processes and dependencies. These small projects help us question what kind of complexity is really necessary. Things get a lot more straightforward when you’re improving your own life or that of the people directly around you. In my case, that’s my peer consultants – I guess the official term is improving the “ConEx”.