Improving Wetware

Because technology is never the issue

Craftsmanship and Engineering

Posted by Pete McBreen 01 Oct 2021 at 17:23

Found a recent reference to my book by Hillel Wayne in a blog post where he interviewed engineers who moved into software development to get their take on the “is software development engineering” question. Overall his take is that

there’s a much smaller gap between “software development” and “software engineering” than there is between “electrician” and “electrical engineer”, or between “trade” and “engineering” in all other fields. Most people can go between “software craft” and “software engineering” without significant retraining. We are separated from engineering by circumstance, not by essence, and we can choose to bridge that gap at will.

I have to agree with Hillel that despite having a masters degree in Manufacturing Systems Engineering I am not a traditional engineer.

I believe everything McBreen said about software is fairly reasonable, about how hard it is to predict things and how it’s intensely personally created. What he gets wrong is the assumption that engineering is not this way. Engineering is much richer, more creative, and more artistic than he thought. But of course he would have an imperfect view: he is, after all, not a traditional engineer. Are we really engineers

Upgrading from Selenium to Playwright

Posted by Pete McBreen 16 Sep 2021 at 22:37

For a long time Selenium has been the default option for writing automated browser tests, but there are alternatives now available. One leading contender is Playwright, and while it does not have as much language support as Selenium, it does cover the main options (JavaScript, Python, Java and .Net) while providing full browser support.

With a large suite of automated Selenium tests, it might not make much sense to rewrite existing tests to use Playwright, but for new test suites it might make sense to switch over to Playwright. It has better affordances for testing, has easy control over the browser contexts without that getting in the way for normal usage and deals with single page applications and websockets. The python samples below are equivalent (other than playwright does not show browser by default).


import pytest
from playwright.sync_api import sync_playwright

def browser():
    playwright = sync_playwright().start()
    browser = playwright.chromium.launch()  # use params headless=False to see browser

    yield browser


def test_blog_title(browser):
    page = browser.new_page()

    # find the blog link in the header and then click on that link"#main_navbar :text('blog')")
    title = page.title()
    assert title.startswith("Blog | Selenium"), title

    h1text = page.wait_for_selector('h1')
    assert h1text.inner_text() == "Selenium Blog"


import pytest
from selenium import webdriver
from import By

def webbrowser():
    firefox = webdriver.Firefox() # default is to show the browser, use options to make headless
    yield firefox

def test_blog_title(webbrowser):


    # want the blog link from header, so two stage process to get correct element
    header = webbrowser.find_element(By.ID,'main_navbar')
    blog = header.find_element(By.PARTIAL_LINK_TEXT , 'Blog')

    title = webbrowser.title
    assert "Blog | Selenium" in title, title

    assert webbrowser.find_element(By.CSS_SELECTOR, 'h1').text == "Selenium Blog"

Slow decision making as a failure mode for Scrum

Posted by Pete McBreen 09 Sep 2021 at 21:03

Although Scrum can operate on with one to four week sprints, most organizations currently choose a one or two week duration for their sprints. This limits how long a team can take to make decisions, since with only five or ten days in the sprint, taking a day or two to organize a meeting to discuss the decision will impact delivery.

Some common failures that I have seen include:

  • Team members letting an open Pull Request sit around for a day or more because they are too busy to review the changes in the PR
  • Developers not immediately merging approved Pull Requests so the changes are delayed getting to the test environment
  • Database changes that impact a few tables needing to be discussed with the DBA who is not available until later in the week
  • Defects reported by testers sitting waiting for a developer to have the time to review report

All of the above are examples of extra time incurred in the process of taking an item from the backlog and moving it to the Done state. This increases the likelihood that there will be a rush to complete items at the end of the sprint, or that the overall sprint goal will not be met.

The fix I see for this is to build slack into the sprint so that the team is not busy all the time. Aiming for 100% utilization always leads to queuing and delays, so the team has to build in appropriate slack time to prevent decision delays.

Note. Martin Fowler has a possible fix for delays caused by the Pull Requests needing reviews.

Microservices, the Cloud and Us

Posted by Pete McBreen 24 Aug 2021 at 21:04

In a presentation on Modular Monoliths Simon Brown had a great comment on Microservices

If you can’t build a modular monolith, what makes you think microservices is the answer?

Or as Architect Clippy said it

I see you have a poorly structured monolith. Would you like me to convert it into a poorly structured set of microservices?

Marcus Ranum has had similar thoughts

I love it when software developers say “How hard can it be?!” and decide to build their own complete replacement system. The results are usually about as bad as the first system, for the same reason. To be fair, this stuff is really hard to write – which is all the more reason to be skeptical when someone says they’ll just put together a modular cloud-based version of their own. You should always ask “why do you believe you will get right the things that everyone else got wrong? Because the reasons that they got it wrong apply to you, as well.”

Overall it seems that although the cloud and microservices are seen as synonymous, microservices bring with them a lot of extra complexity and interdependencies that cost time and effort to resolve and work around. Justin Etheredge has written that it is all about scale

The problems I’ve dealt with are not at the scale of Google, Facebook, or Uber.

Few organizations have to deal with the problem of how to coordinate more than 100 developers. Yes, Spotify and Amazon have that sort of problem, but most of us do not. So it is therefore very likely that the solutions that work for their scale of problem might not be the best ones for the rest of us.

An example that I have seen recently is that of a web application that was moved from the old style LAMP stack to an Angular single page application front end supported by a REST API at the backend. So now a change to the requirements meant that the release of the typescript changes to the angular pages had to be coordinated with the release of the REST API changes (which needed to be versioned in case there were old clients using the prior version of the service).

Revisiting "Applying the Lessons of eXtreme Programming"

Posted by Pete McBreen 11 Aug 2021 at 17:37

Way back in 2000 for the TOOLS34 conference I wrote an article Applying the Lessons of eXtreme Programming and it was interesting to look back at it to see how things have worked out since then.

  • Applying JUnit – the unit test frameworks as not as widely utilized as I expected, but now all modern programming languages ship with a flavor of unit testing built in
  • Be intolerant of process variations and working off process – many teams have not learned this lesson and have problems arising from this
  • Use a coach to keep the team on process and improve individual skills – this does not seem to have caught on, even in teams that use Scrum, the scrum master role does not always enforce process
  • Apply the Quality First Strategy – I still see to many teams that allow code to be merged without adequate tests that later prove to contain incorrect code
  • Continuous Integration is the only way to avoid Integration Hell – I’d soften this to a way to avoid, and many teams still have problems making this work
  • Tired humans produce lousy software – still true, and even agile teams sometimes forget this
  • Incremental Development requires Incremental Requirements Capture – still true, some teams are getting good at this
  • Simple Designs are easier to maintain and evolve – in the era of microservices, we seem to be forgetting this lesson. Yes the microservices themselves are simple, but the interaction between multiple microservices are really hard to maintain for a lot of teams
  • Adjust your development process slowly and measure the effects of the change – I still think this is a good lesson from XP, but all too many teams are not very good at this
  • Many projects would benefit from a good dose of Reality Therapy – still true unfortunately

Hyperloop Delusions

Posted by Pete McBreen 25 Jun 2021 at 23:03

Now the Calgary - Edmonton link is dreaming about a hyperloop, Alberta’s version is called Transpod. Did no reporter look at the image and ask any questions?

  • The pretty photoshop picture shows a transparent tube - that is going to work well as a structural material to contain a vacuum
  • These transparent tubes are called “magnetic tubes” – there are very few transparent materials that are able to generate or support strong magnetic fields
  • Speeds of up to 1,000 kilometres an hour, that is going to need a pretty hard vacuum to prevent the pod from generating a massive pressure wave in front of it as it travels at that speed
  • The picture shows nothing in the way of pumping infrastructure to maintain the vacuum
  • How long will it take to pump the vacuum down at each end of the line?
  • What provision is there for passengers to escape from the pod in the case of an issue? It is not like you can walk through a vacuum
  • Why did nobody ask about all the other vapourware hyperloops that have been in development? Nobody has yet demonstrated anything beyond a toy prototype.
  • Why did nobody as why not just build a high speed rail line for passengers and cargo?

SWEBOK 3 is out

Posted by Pete McBreen 22 Apr 2021 at 20:51

Not had a chance to read it in detail but one section stood out

Parameterized types, also known as generics (Ada, Eiffel) and templates (C++), enable the definition of a type or class without specifying all the other types it uses. The unspecified types are supplied as parameters at the point of use. Parameterized types provide a third way (in addition to class inheritance and object composition) to com-pose behaviors in object-oriented software.

I would have thought that this updated reference would use more modern languages for the examples. Ada might still be in use, but Eiffel never really caught on, might have been more relevant to use Go, Rust or even Java as examples in this context.

PlantUML has a good way of visualizing JSON

Posted by Pete McBreen 24 Jan 2021 at 03:55

Ran across this while looking at for a way to visualize the architecture of a system.

PlantUML now supports JSON in that it can draw a diagram that reflects the structure of the JSON, and it does it relatively simply. The #highlight “phoneNumbers” provides highlighting on the image, and even better, you can use markdown like highlighting inside the JSON to do the usual bolding as per the firstName at the top of the JSON.

#highlight "phoneNumbers"
{"**firstName**": "John","lastName": "Smith","isAlive": true,"age": 27,"address": 
{"streetAddress": "21 2nd Street","city": "New York","state": "NY","postalCode": "10021-3100"},
"phoneNumbers": [{"type": "home","number": "212 555-1234"},
{"type": "office","number": "646 555-4567"}],
"children": [],"spouse": null}

Then simply execute plantuml to generate the image

> java -jar plantuml.jar -tpng json.txt

Resulting Image that can be rendered into most formats that you might be interested

JSON Image

This is a great example of the concept of Diagrams as Text as it gives a source that is easy to diff while still allowing an easy to view presentation format.

Another take on Engineering vs. Craftsmanship

Posted by Pete McBreen 19 Jan 2021 at 13:02

Hillel Wayne has an interesting take

Many people have asked me why I care so much about this project. Why does it matter whether or not software is “really” engineering? Why can’t we just say that “software is software”? It’s because of these misconceptions. People have a stereotyped notion of what engineering looks like. Because software doesn’t look like the stereotype, they assume that we are wholly unlike engineering. The engineering disciplines have nothing to teach us. We are breaking pristine ground and have no broader history to guide us.

Is Serverless a return to the days of batch mainframe processing?

Posted by Pete McBreen 21 Dec 2020 at 23:48

Cees de Groot seems to think so

You deploy, get an error message, and login to CloudWatch to see what actually happened - it’s all batch-driven, just like the bad old days, so progress is slow. At least we’re not having to walk to the printer on every try, but that pretty much sums up the progress of the last half century. Oh, and a “Function” will be able to handle a single request concurrently, so here’s your AWS hosting bill, we needed a lot of instances, I hope you won’t have a heart attack. Yes, we run workloads that your nephew can run on his Raspberry Pi 4, but this is the future of enterprise.

Another take on computer security

Posted by Pete McBreen 21 Dec 2020 at 04:48

In Lost In The Clouds Marcus Ranum started off by saying

Back when I worked in security, I regularly encountered things that just left me shaking my head, “why would anyone want to do this?” It made me feel increasingly distanced and out of touch with the industry/community, as the decision-making herd went thundering off over the horizon, ignoring the sign that said “cliff.”

The recently publicized SolarWinds Breach has a good discussion on transitive trust and the issues that can arise from that

The entire software ecosystem is one great network of relationships, virtually any of which can be lashed into a transitive trust attack. When you install the device driver for your graphics card, your Windows desktop system checks the signature on the driver and allows it to run in kernel space, with complete unrestricted access to system memory, the devices, and the CPU. But who wrote the driver? Possibly a consultant. Possibly the manufacturer. […] what if the programmer at the vendor who is writing the driver decides to use some XML parser code from some open source software repository. Do you think they read through the parser code and check for backdoors?

More from Marcus on the SolarWinds Breach

In fact it sounds like SolarWinds was a fairly typical software development shit-show. Developers sometimes feel that being smart is all that’s necessary to build secure, well-architected systems and networks. Too bad they’re wrong. I have heard development managers non-ironically say, “our guys are really on the ball and I know they monitor the code repository carefully” so that’s good enough – there’s no need to worry about someone putting code in some library that one of the developers just lifted from some open source software archive. Hint to would-be hackers: write a pretty graphing package and put a few extra nudge-nudge features in it and you, too, can pwn a ton of development shops.

Arecibo Observatory is no more

Posted by Pete McBreen 20 Nov 2020 at 04:43

Sad news about our ability to maintain things, the renowned Arecibo Observatory in Puerto Rico has been deemed unrepairable.

The telescope was built in the 1960s with money from the Defense Department amid a push to develop anti-ballistic missile defenses. In its 57 years of operation, it endured hurricanes, endless humidity and a recent string of strong earthquakes.

One of the auxiliary cables snapped and tore a hole in the reflector dish, and more recently one of the supporting cables failed. Although the claim was made that the maintenance procedures had been followed, I have to take it as another example of infrastructure that is slowly crumbling.

One take on the state of EdTech

Posted by Pete McBreen 28 Jun 2020 at 23:06

The 100 Worst Ed-Tech Debacles of the Decade is a good hint about what the author thinks of the state of EdTech.

Unfortunately after reading through the list of items, it is hard to conclude that things are getting better. Clickers, e-textbooks and gamification are nowhere near the worst part of the EdTech problems, but they are still being adopted…

On top of this, people are still trying to claim that there is a STEM Crisis, but the reality appears to be that it is hard to get people to apply for STEM jobs that pay barely more than fast food service jobs.

University Leadership...

Posted by Pete McBreen 25 May 2020 at 22:16

Chronicle : The pandemic reveals ineptitude at the top

How does a university with a $6-billion endowment and $10 billion in assets suddenly find itself in a solvency crisis? How is one of the country’s top research universities reduced, just a month after moving classes online, to freezing its employees’ retirement accounts?

But a university is not a corporation that must maximize its profitability for the next quarterly earnings call. It is, or should be, an institution with far longer time horizons. Johns Hopkins has weathered two world wars, a Great Depression, a global flu pandemic, and multiple economic crashes, the last barely a decade old. Some American universities are older than the nation itself. These institutions exist for the long term.

As time passes, only COBOL lasts

Posted by Pete McBreen 05 Apr 2020 at 02:59

I heard that statement from Trygve Reenskaug over 20 years ago at a conference, and amazingly it is still true. New Jersey is trying to hire COBOL programmers to fix a 1980’s vintage unemployment claims processing system.

Supposedly due to have been replaced shortly after the Y2K issues, the system is still operational and in need of a few good COBOL Programmers.

It took a bit of searching, but I mentioned Trygve Reenskaug in an InformIT article I wrote all the way back in 2002 on “Design for Maintenance”.

The sorry state of software in many organizations is attested to by the way that people talk about “legacy systems.” Nobody seems to be excited about working on a legacy system, even those that are mission-critical or that handle the bulk of an organization’s revenue stream. Sometimes it seems as if no one wants to work on legacy systems, except maybe as a precursor to replacing those legacy systems.

The problem is that many organizations have let their mission-critical systems fall into an abysmal state where nobody in the organization really understands these legacy systems any more. Even worse, the organizations have failed to train their developers in the technologies they need to know to look after these mission-critical applications. No wonder it takes forever to get a simple change made on these mission-critical systems—nobody in the organization knows how to write COBOL. It would be laughable if it weren’t so sad.

As Trygve Reenskaug once said, “As time passes, only COBOL lasts.” The reality is that in the 1970s and 1980s lots of mission-critical applications were written in COBOL or similar vintage languages such as Assembler, FORTRAN, PL/1, and RPG. Even now, in the age of the Internet, Java, and web services, most companies are still dependent on applications written in these “legacy” languages. The Y2K fiasco didn’t kill off all these mission-critical applications; they’re just as important as they ever were.

How expensive is the cloud?

Posted by Pete McBreen 25 Feb 2020 at 04:57

Although AWS and similar solutions offer a low cost of entry, with typical systems involving 100s of machines, maybe the cost of the cloud is not as trivial as we first thought…

I was initially prompted to write about this when I noticed developer laptops specifications being drastically upgraded to support 10+ virtual machines in order to support running a small part of a microservices architecture under development. Yes, a developer could run one part easily on a normal laptop, but in order to test out even simple scenarios, 10 different VMs needed to be configured and started up. In contrast a Elixir/Phoenix/PostgreSQL development environment can run on a relatively cheap, low memory laptop.

Marcus Ranum recently wrote about being Lost in the Clouds

“In its most basic form, cloud computing allows you to transfer some risks around, and that’s it. It saves you money if you choose well and can aggregate services with other customers and avoid lock-in, and it allows you to fire those pesky system administrators who used to manage your storage array. Instead of capital expenses for salary and desks and hard drives, you can pay more for something you don’t own which is slower and out of your direct control.” [Emphasis in the original]

Andreessen Horowitz in The New Business of AI, –hat tip to Scott Locklin – have reported that

“… these forces contribute to the 25% or more of revenue that AI companies often spend on cloud resources. ”

The problem is that running an AI model typically takes a very large data set and a lot of processing, both of which are expensive in the cloud. While these expenses are operating costs, it can seem to be cost effective, but as Ranum points out, it might be that purchasing machines and hosting them onsite is a cheaper and faster solution.

My take is that the cloud is reinventing the big-iron era of the mainframe, where companies end up paying by the hour for processing time, storage and network bandwidth. Yes, there may be some circumstances where this makes sense, but when you need to start hiring administrators to configure and manage your cloud, maybe it is time to to consider the alternative.

Affordances in version control systems

Posted by Pete McBreen 07 Nov 2019 at 18:32

Fossil and git have subtly different affordances while making a changes to a codebase.

fossil commit takes all changed files in the directory and commits those changes. This means that all files that have been edited in the project are committed at the same time, so a developer has to make sure that during a session, the changes made in the project have to be related to the current change/fix.

git add, aka. git stage adds a modified file to the index so that it will be included in the next commit. This means that you can modify multiple files and then selectively choose which files to commit. So a developer using git can work on multiple different changes in the same session and then selectively choose which to commit as a group.

Overall this difference in affordance means that a fossil user has to stay focused, but when using git a user can work on multiple different fixes concurrently.

Are we getting better at estimating software projects?

Posted by Pete McBreen 27 Oct 2019 at 22:38

Looking back over Kyle Wilson’s Software Is Hard post from way back in 2007, I was reminded about how bad we are at estimating software projects. As the article hinted at, we can get reasonable estimates for software we have previously developed, but there is little value in redeveloping existing software.

You never have to solve the exact problem that someone’s solved before, because if software already existed that solved your need, you wouldn’t have to write it. Writing software is expensive. Copying software is cheap.

In case anyone thinks that this lesson from 12 years ago is no longer relevant, think of the multiple promises that have been made for autonomous cars in recent years and compare that to the performance of the recently released Smart Summon feature in Teslas. There are multiple videos out there of cars driving tentatively or erratically in a parking lot, and we think that soon these cars will be ready to drive unattended on a public highway?

On a historical note, Consumer Reports back in early 2016 reported on the self parking feature, and now in late 2019, Consumer Reports suggests that Tesla Owners are Beta Testers.

Elixir/Phoenix/Ecto, UTC in database, localtime on web page

Posted by Pete McBreen 02 Oct 2019 at 22:29

Turns out this is non-trivial, as Elixir ships with partial support for timezones, as documented by Lau Taarnskov, the creator of the Tzdata library.

First thing is to add the tzdata to mix deps

{:tzdata, "~> 1.0.1"}, 

Then have to set ecto to use utc_datetime (microseconds not needed for simple table updates)

@timestamps_opts [type: :utc_datetime] 

At this point, no matter what timezone you are in, an update will store the UTC time in the database, and show that UTC time on the web page, which although correct is not all that useful. Annoyingly querying for a timestamp in pgAdmin, will transparently shift any timestamp with time zone column to show the local time, but when displayed on the web page it will show the UTC time with a Z after the time as in “2019-10-02 21:58:15Z” rather than the local time of “2019-10-02 15:58:15”

The fix for this is to use DateTime.shift_zone/3 (shift_zone!/3 coming in Elixir 1.10), for now that code can live in the view

def format_timestamp(timestamp, time_zone) do
    |> shift_zone(time_zone)
    |> raw

defp shift_zone(timestamp, time_zone) do
   case DateTime.shift_zone( timestamp, time_zone) do 
     {:ok, dt} -> NaiveDateTime.to_string(DateTime.truncate(dt, :second)
     {:error, _dt} -> NaiveDateTime.to_string(DateTime.truncate(timestamp, :second)) <> " UTC"

And then in the template you can just use this to format the timestamp correctly for the appropriate timezone

  <%= format_timestamp(subject.updated_at, "MST7MDT") %>

My timezone is MST7MDT, but it will be better to pull it in from the user’s browser using some JavaScript and push it into the session, or have it as a configurable value on the user profile. Luckily all modern browsers now have a simple way to get the timezone from the browser…

const tz = Intl.DateTimeFormat().resolvedOptions().timeZone;

Please note that to be safe, the shift_zone function will return the UTC time if the timezone passed in is not recognized.

Elixir Ecto simple SQL Query

Posted by Pete McBreen 21 May 2019 at 06:00

Demo after the previous post, showing just executing SQL

Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> alias Ora.Repo
iex(2)> result = Ecto.Adapters.SQL.query!(Repo, "select * from scott.emp where empno = :1 ", [7369])

13:34:54.892 [debug] QUERY OK db=16.0ms
select * from scott.emp where empno = :1  [7369]
  columns: ["EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM", "DEPTNO"],
  num_rows: 1,
  rows: [
    [7369, "SMITH", "CLERK", 7902, ~N[1980-12-17 00:00:00], 800.0, nil, 20]

Note. Can also use Ecto.Adapters.SQL.query which will then return the usual tuple

iex(3)> result = Ecto.Adapters.SQL.query(Repo, "select * from scott.emp where empno = :1 ", [7369])

13:38:05.739 [debug] QUERY OK db=16.0ms
select * from scott.emp where empno = :1  [7369]
   columns: ["EMPNO", "ENAME", "JOB", "MGR", "HIREDATE", "SAL", "COMM",
   num_rows: 1,
   rows: [
     [7369, "SMITH", "CLERK", 7902, ~N[1980-12-17 00:00:00], 800.0, nil, 20]

Need to set MIX_ENV to dev/test/prod to switch between environments.