Running unsupported Azure Python SDK on my brand new M2 Mac

09 Jun, 2023

How it started out

When setting up my new MacBook Pro M2, I started with a fresh install instead of restoring from a backup. This worked out great until I tried to follow a tutorial written by a colleague, which used the Azure Python SDK to create a dataset and upload it to an Azure storage account.

The error was generic and not very descriptive:

ModuleNotFoundError: No module named 'azureml.dataprep'

Let’s try to fix this

Some searching on the internet suggested I needed to install the azureml-dataset-runtime dependency. I assumed this was already done, since it was listed in the pyproject.toml file, but I went ahead and tried to install the library anyway.

pip install azureml-dataset-runtime==1.40.0
Failed to build numpy
ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
[end of output]

I did some more searching and found a known issue with the Azure ML Python SDK on M1 or M2 MacBooks: Azure Python SDK issue

The suggestion was to use Rosetta to run Python on the Mac. That sounded a bit drastic to me, so I searched some more and stumbled upon this post: How to create a separate Rosetta Terminal, which showed how to create a separate Terminal application that runs under Rosetta. Unfortunately, this did not work on the M2 MacBook (Ventura), as stated in the comments further down the thread. The discussion in that thread did point me in the right direction, though.

The story continues

So I needed a thing called Rosetta. It’s a translation layer that lets applications compiled exclusively for Intel processors run on Apple silicon. And yes, that seemed like something I needed. But not for all my software, right? And not for all my Python projects either. I needed to do a bit more research on the topic, and then this Stack Overflow post caught my eye.

Some extra searching about the arch command led me to this final command, which starts a Terminal session with the x86_64 architecture:

arch -x86_64 /bin/zsh --login

For this to work, Rosetta needs to be installed. If it is not, you can install it with this command:

softwareupdate --install-rosetta --agree-to-license
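
If you prefer to check first instead of running the installer unconditionally, something like the following could work. This is a minimal sketch: ensure_rosetta is a hypothetical helper name, and it assumes that a failing arch -x86_64 /usr/bin/true means Rosetta is not installed yet.

```shell
# Hypothetical helper: install Rosetta only when x86_64 binaries cannot run yet.
ensure_rosetta() {
  if [ "$(uname -s)" != "Darwin" ]; then
    echo "Not macOS, nothing to do"
    return 0
  fi
  # Assumption: if a trivial x86_64 process starts, Rosetta (or an Intel CPU) is present.
  if arch -x86_64 /usr/bin/true 2>/dev/null; then
    echo "x86_64 binaries already run, skipping install"
  else
    softwareupdate --install-rosetta --agree-to-license
  fi
}
```

Calling ensure_rosetta once before the first switch should be enough; after that the check is just a cheap no-op.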

The arch -x86_64 /bin/zsh --login command is a nice piece of command line, but not that easy to remember, so I created a helper function around it in my .zshrc:

# Rosetta start alternative session
switch_rosetta () {
    arch -x86_64 /bin/zsh --login
}
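
To confirm that a session started this way really runs as x86_64, the machine name can be inspected. The snippet below is a sketch under a few assumptions: arch_label is a hypothetical helper, and it relies on uname -m printing x86_64 under Rosetta and arm64 in a native session.

```shell
# Hypothetical helper: translate a machine name into a readable label.
arch_label() {
  case "$1" in
    arm64)  echo "arm64 (native Apple silicon)" ;;
    x86_64) echo "x86_64 (Intel, or Rosetta on Apple silicon)" ;;
    *)      echo "unknown ($1)" ;;
  esac
}

# Architecture of the current shell session
arch_label "$(uname -m)"

# Architecture the Python interpreter reports, if python3 is on the PATH
if command -v python3 >/dev/null 2>&1; then
  arch_label "$(python3 -c 'import platform; print(platform.machine())')"
fi
```

Running it before and after switch_rosetta should show the label change for the shell, and the Python line helps spot an interpreter that was built for the other architecture.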

So is that it?

Well, not really. We are only halfway. For the Python code to work correctly, we need more than a separate session: the software it uses also has to be built for the x86_64 architecture. So I needed a separate Homebrew installation for installing software in the Rosetta environment.

switch_rosetta

# install Homebrew
arch -x86_64 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

Fortunately, the Rosetta version of Homebrew is installed in a different location (/usr/local/Cellar and /usr/local/bin) than the existing Homebrew installation (/opt/homebrew and /opt/homebrew/bin).
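
The two locations map directly onto the two architectures, which can be sketched as a small lookup. Note that brew_prefix_for is a hypothetical helper, and the prefixes are Homebrew's defaults; a custom installation may differ.

```shell
# Hypothetical helper: default Homebrew prefix for a given machine name.
brew_prefix_for() {
  case "$1" in
    x86_64) echo "/usr/local" ;;
    arm64)  echo "/opt/homebrew" ;;
    *)      echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Show which brew binary the current session should use
if prefix="$(brew_prefix_for "$(uname -m)")"; then
  echo "Expected brew: ${prefix}/bin/brew"
fi
```

Comparing this with the output of command -v brew is a quick way to notice that a session is picking up the wrong Homebrew.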

This way we can install the needed software for our project separately.

brew install python@<version>
brew install azure-cli
brew install poetry
# etc.

Finalising the setup

It would be nice if the terminal session picked up the right Homebrew settings automatically. So, again, I created a helper function in my .zshrc:

# Rosetta alternative sessions
archcheck () {
   if [ "$(uname -p)" = "i386" ]
   then
     echo "Running in i386 mode (Rosetta)"
     eval "$(/usr/local/Homebrew/bin/brew shellenv)"
     # Remove /opt/homebrew/bin and /opt/homebrew/sbin from path to not interfere with /usr/local/bin
     path=( ${path[@]:#/opt/homebrew*} )
     # Use a separate pypoetry cache directory so the virtual envs of both architectures do not mix
     export POETRY_CACHE_DIR="${HOME}/Library/Caches/pypoetry-i386"
     # The cache directory has to exist; mkdir -p is a no-op when it already does
     mkdir -p "${POETRY_CACHE_DIR}"
     alias brew='/usr/local/Homebrew/bin/brew'
   elif [ "$(uname -p)" = "arm" ]
   then
     echo "Running in ARM mode (M2)"
     eval "$(/opt/homebrew/bin/brew shellenv)"
     # Remove /usr/local/bin from path to not interfere with /opt/homebrew
     path=( ${path[@]:#/usr/local/bin} )
     export POETRY_CACHE_DIR=${HOME}/Library/Caches/pypoetry
     alias brew='/opt/homebrew/bin/brew'
   else
     echo "Unknown architecture detected"
   fi
}
# Run the check for every new session
archcheck
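
The path=( ${path[@]:#pattern} ) lines in archcheck are zsh-specific: they rebuild the path array without the entries that match the pattern. For readers less familiar with that syntax, here is a rough portable equivalent. It is only an illustration: path_without_prefix is a hypothetical helper, and unlike the exact match used for /usr/local/bin above, it always matches on a prefix.

```shell
# Hypothetical helper: print a PATH-style string without the entries
# that start with the given prefix.
path_without_prefix() {
  local prefix="$1" rest="$2:" result="" entry
  while [ -n "$rest" ]; do
    entry="${rest%%:*}"   # first remaining entry
    rest="${rest#*:}"     # everything after the first colon
    case "$entry" in
      "$prefix"*) ;;      # matches the prefix: drop it
      *) result="${result:+${result}:}${entry}" ;;
    esac
  done
  printf '%s\n' "$result"
}

path_without_prefix /opt/homebrew "/opt/homebrew/bin:/usr/bin:/opt/homebrew/sbin:/bin"
# prints /usr/bin:/bin
```

In zsh itself the one-liner from archcheck is the shorter choice; the loop only spells out what it does.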

That’s it

Now I can easily switch between Terminal sessions, and most of the setup is done automatically in the archcheck function. It is not 100% foolproof. For example, docker commands stopped working in the ARM session. In archcheck I remove /usr/local/bin from the path so the two sets of Homebrew-installed packages do not mix, but Docker Desktop also uses that location for the symbolic links to the docker commands. I fixed this by also copying those symbolic links to /opt/homebrew/bin. I expect to run into more of these small quirks, but I am happy with the current solution for now.
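
The Docker fix could be scripted along these lines. This is a sketch rather than the exact commands used: link_docker_cli is a hypothetical helper, it creates new symbolic links that point at the existing entries instead of copying them, and it assumes every Docker command in the source directory has a name starting with docker.

```shell
# Hypothetical helper: expose every docker* entry of one bin directory
# in another bin directory through symbolic links.
link_docker_cli() {
  local src_dir="$1" dest_dir="$2"
  find "$src_dir" -mindepth 1 -maxdepth 1 -name 'docker*' |
    while IFS= read -r entry; do
      # -f replaces a link of the same name that already exists
      ln -sf "$entry" "$dest_dir/$(basename "$entry")"
    done
}
```

With the locations from this post, the call would be link_docker_cli /usr/local/bin /opt/homebrew/bin. It may need to be repeated after a Docker Desktop update adds new commands.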

I believe this is a very good starting point which I hope will benefit others who are developing for AzureML.

Cleanup

Once the time comes that all software is Apple silicon compatible, you might want to clean up the separate Homebrew installation.
This post about uninstalling Homebrew can act as a starting point. Make sure you do this whilst in the Rosetta Terminal session. After this cleanup the .zshrc file can also be cleaned up.
