How it started out
When installing software on my new MacBook Pro M2, I started out with a fresh install instead of restoring from a backup. This worked out great until I tried to follow a tutorial written by a colleague, which used the Azure Python SDK to create a dataset and upload it to an Azure storage account.
The error was generic and not very descriptive:
ModuleNotFoundError: No module named 'azureml.dataprep'
Let’s try to fix this
Some searching on the internet suggested I needed to install the azureml-dataset-runtime dependency. I assumed it was already installed since it was listed in the pyproject.toml file, but I went ahead and tried to install the library anyway.
pip install azureml-dataset-runtime==1.40.0
Failed to build numpy
ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
[end of output]
I did some more searching and found a known issue with the Azure ML Python SDK on M1 or M2 MacBooks: Azure Python SDK issue
The suggestion was to use Rosetta to run Python on a Mac. This sounded a bit too drastic to me, so I searched a bit more and stumbled upon this post: How to create a separate Rosetta Terminal, which showed how to create a separate Terminal application that uses Rosetta. Unfortunately this did not work on the M2 MacBook (Ventura), as stated in the comments further down the thread. The discussion in that thread did give me some direction for further searching, though.
The story continues
So I needed a thing called Rosetta. It’s a translation layer that lets applications compiled exclusively for Intel processors run on Apple silicon. And yes, that seemed like something I needed. But not for all my software, right? And also not for all my Python projects. I needed to do a bit more research on the topic, and then this Stack Overflow post caught my eye.
Some extra searching about the arch command led me to this final command to start a Terminal session with the x86_64 architecture:
arch -x86_64 /bin/zsh --login
For this to work, Rosetta needs to be installed. If it is not, you can use this command to install it:
softwareupdate --install-rosetta --agree-to-license
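The check and the install can be combined so the installer only runs when it is needed. The following is a minimal sketch, not something from my own .zshrc: the function names rosetta_available and ensure_rosetta are made up for illustration, and the probe relies on the assumption that arch -x86_64 cannot launch a binary on Apple silicon until Rosetta is installed.

```shell
# Hypothetical helpers, not part of macOS or Homebrew.

# Succeeds (exit code 0) when an x86_64 binary can be launched.
# Assumption: on Apple silicon this only works once Rosetta is installed.
rosetta_available () {
  arch -x86_64 /usr/bin/true >/dev/null 2>&1
}

# Runs the Rosetta installer only when the probe above fails.
ensure_rosetta () {
  if rosetta_available
  then
    echo "Rosetta is already available"
  else
    echo "Installing Rosetta"
    softwareupdate --install-rosetta --agree-to-license
  fi
}
```

Calling ensure_rosetta once before opening the first x86_64 session should be enough; on later calls it only prints the "already available" message.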
The arch -x86_64 /bin/zsh --login command is a nice piece of command line but not that easy to remember, so I created a helper function around it in my .zshrc:
# Rosetta start alternative session
switch_rosetta () {
  arch -x86_64 /bin/zsh --login
}
So is that it?
Well not really. We are only halfway. For the Python code to work correctly we need more than only a separate session. The software used also needs to be architecture specific. So I needed to install a separate Homebrew to use when installing software for the Rosetta environment.
switch_rosetta
# install Homebrew
arch -x86_64 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
Fortunately the Rosetta version of Homebrew is installed in a different location (/usr/local/Cellar and /usr/local/bin) than the already existing Homebrew installation (/opt/homebrew and /opt/homebrew/bin).
This way we can install the needed software for our project separately.
brew install python@<version>
brew install azure-cli
brew install poetry
etc...
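With two installations that both provide a brew command, it is easy to lose track of which one a session should be using. As a rough sketch, the expected prefix can be derived from the machine name that uname -m reports. The function name brew_prefix_for_arch is hypothetical, and the two prefixes are the default locations mentioned above; a customised installation may differ.

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the default Homebrew prefix for that architecture.
brew_prefix_for_arch () {
  case "$1" in
    x86_64) echo "/usr/local" ;;      # Intel Mac or Rosetta session
    arm64)  echo "/opt/homebrew" ;;   # native Apple silicon session
    *)      echo "unknown architecture: $1" >&2; return 1 ;;
  esac
}
```

Running brew_prefix_for_arch "$(uname -m)" in a session and comparing the result with the output of brew --prefix is a quick way to spot a session that picks up the wrong Homebrew.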
Finalising the setup
It would be nice to make the terminal session aware of the separate Homebrew settings etc. So again I created a helper function in my .zshrc:
# Rosetta alternative sessions
archcheck () {
  if [ "$(uname -p)" = "i386" ]
  then
    echo "Running in i386 mode (Rosetta)"
    eval "$(/usr/local/Homebrew/bin/brew shellenv)"
    # Remove /opt/homebrew/bin and /opt/homebrew/sbin from path to not interfere with /usr/local/bin
    path=( ${path[@]:#/opt/homebrew*} )
    # separate pypoetry cache directory to separate the different poetry virtual envs
    # Cache directory needs to be created: mkdir -p ${HOME}/Library/Caches/pypoetry-i386
    export POETRY_CACHE_DIR=${HOME}/Library/Caches/pypoetry-i386
    alias brew='/usr/local/Homebrew/bin/brew'
  elif [ "$(uname -p)" = "arm" ]
  then
    echo "Running in ARM mode (M2)"
    eval "$(/opt/homebrew/bin/brew shellenv)"
    # Remove /usr/local/bin from path to not interfere with /opt/homebrew
    path=( ${path[@]:#/usr/local/bin} )
    export POETRY_CACHE_DIR=${HOME}/Library/Caches/pypoetry
    alias brew='/opt/homebrew/bin/brew'
  else
    echo "Unknown architecture detected"
  fi
}
# Run the check at the start of every session
archcheck
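The ${path[@]:#pattern} expansion used above is zsh-specific: it drops every element of the path array that matches the pattern. For anyone who wants the same filtering in another shell, here is a sketch that works on a colon-separated string instead. The name strip_path_prefix is made up for this example, and the helper variables are global because plain POSIX shell has no local keyword.

```shell
# Hypothetical helper: print the colon-separated list in $1 without
# the entries that start with the prefix given in $2.
strip_path_prefix () {
  rest="$1:"      # trailing colon so every entry ends with a separator
  result=""
  while [ -n "$rest" ]
  do
    entry="${rest%%:*}"   # text before the first colon
    rest="${rest#*:}"     # everything after the first colon
    case "$entry" in
      "$2"*) ;;                                  # drop matching entries
      *) result="${result:+$result:}$entry" ;;   # keep everything else
    esac
  done
  echo "$result"
}
```

It could then be used as PATH="$(strip_path_prefix "$PATH" /opt/homebrew)" in the Rosetta branch. Note that it matches on a prefix, so passing /usr/local/bin would also remove an entry such as /usr/local/bin2.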
That’s it
Now I can easily switch between Terminal sessions, and most of the setup is done automatically in the archcheck function. It is not 100% foolproof. For example, docker commands stopped working. In archcheck I remove /usr/local/bin from the path so the software packages of the two Homebrew installations do not mix, but Docker Desktop also uses that location to install the symbolic links to the docker commands. I fixed this by copying those symbolic links to /opt/homebrew/bin as well. I expect to run into more of these small quirks, but I am happy with the current solution for now.
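Copying the symbolic links can be scripted as well. The sketch below shows one possible way to do it, under a few assumptions: the function name copy_docker_links is hypothetical, the Docker Desktop links all have names starting with docker, and their targets are absolute paths (a relative target would point somewhere else once the link lives in another directory).

```shell
# Hypothetical helper: recreate every docker* symbolic link found in the
# source directory ($1) inside the destination directory ($2).
copy_docker_links () {
  find "$1" -maxdepth 1 -type l -name 'docker*' | while IFS= read -r link
  do
    # Point the new link at the same target as the original link.
    ln -sf "$(readlink "$link")" "$2/$(basename "$link")"
  done
}
```

For the situation described here the call would be copy_docker_links /usr/local/bin /opt/homebrew/bin. Regular files and links with other names are left alone.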
I believe this is a very good starting point which I hope will benefit others who are developing for AzureML.
Cleanup
Once the time comes that all software is Apple silicon compatible, you might want to clean up the separate Homebrew installation.
This post about uninstalling Homebrew can act as a starting point. Make sure you do this while in the Rosetta Terminal session. After this cleanup, the .zshrc file can also be cleaned up.