Not done yet?
Yeah, I am back. And so are you, it seems. The last post was about Enhancing network migrations towards an SDA Network with Nautobot, Nornir, Cisco ISE and Checkpoint Mgmt API (and your user Directory Pages) part 1, part 2 and part 3. This one comes after the SD-Access migration (hence the title).
The last user migration weekend was at the beginning of September. I didn’t have enough time to deal with all my technical debt before then, so it had been piling up for months and months, as everything was kept frozen while the SDA project was ongoing:
- the Nautobot version,
- the Python package versions,
- the Python code for automated tasks used in the project.
Beautiful dawn! Your town?
No no, some friends are over there this week. I just wish I was there as well..
As soon as the SD-Access migration was done, I understood that I had to get moving immediately on defrosting those items, or face the risk that a big part of the documentation data stored in Nautobot would become irrelevant. I had also realized that staying too far behind the latest Nautobot version would eventually cause more problems, especially in instances based on standard virtual machines (e.g. Ubuntu) instead of docker & docker compose.
It was time to upgrade the production instance to Nautobot version 2.x; it had been due for a while. The version I was using during the last parts of the SDA migration project was v.1.6.7. The change would impact the existing code, which was written using pynautobot, Nornir, the Nornir-Nautobot inventory python package and a lot of library code our team had created to help with the migration (and other work items). I would have to make a lot of adjustments there as well.
What was the plan?
The logic was simple: bring everything up to date and get rid of the documentation drift “caused” by the SDA migration project, as soon as possible:
- get all the current network data from the real network into Nautobot,
- get rid of the old network data present in Nautobot that is no longer valid for the real network.
The upgrade to v.2.x initially seemed more or less straightforward. It was all about checking the package versions (without forgetting the dependencies on the Python version, if that applies to your instance), looking for breaking changes and incompatibilities as usual, but in this case also reading carefully what a major version upgrade would bring.
Network to Code has put up some documentation for it here: https://docs.nautobot.com/projects/core/en/stable/user-guide/administration/upgrading/from-v1/upgrading-from-nautobot-v1/. I had prepared the locations according to that and made sure I didn’t have any of the problems described in the doc, or so I thought. I also ran the pre_migrate command to make sure:
nautobot-server pre_migrate
I have to give this to NTC: they do tell you everything (or very nearly everything). You just don’t realize it at the moment. For example, I didn’t pay attention to the Role changes. It did come back to bite me later on. Same with the warnings about container and network prefix types, etc. It’s about not having enough time, stamina and focus to go as far as you should when preparing an upgrade. But the pre-migrate check had passed, so..
You pushed the button?
Yes.. I first went as far as performing an upgrade in a lab instance, which went rather well, but I did not immediately move forward with the production server used in the migration, with all that network automation code involved in the project at risk of breaking.
When the time finally came to upgrade the production server, I was surprised to find out that the docker compose based Nautobot setup had not stayed frozen like my production server (yes, the world turns without our help or permission). It had finally gone through a total makeover, through the amazing work of Mr Justin Drew (@jdonga in the Nautobot channel in the NTC Slack). So I had a lot more extra reading and work to do there, to adjust it for our use case and cover all our specific needs.
There were more things to do. There was a need to get data updated more quickly in Nautobot, to reflect all the changes that had occurred during the migration.
Taking care of changes related to old and decommissioned network devices (IP addresses/prefixes/interfaces etc.) was one thing; I was already doing that, mostly in waves, during the migration. Onboarding the new devices was another. I had a mind to follow what we had done the last time, perhaps using the current dedicated vendor tool that already had the information in a single place: Cisco DNA Center. But that would mean creating the necessary scripts all over again, either using the REST API or the DNAC python sdk, or finding another way of syncing to DNAC. The SSoT Nautobot plugin looked and sounded like a nice choice for this, but the version supporting DNAC wasn’t yet publicly available. So in that regard the new Device Onboarding plugin, which uses the SSoT plugin, seemed like a practical choice to speed things up.
The final part of the work was about adjusting the jobs we had already created to the new architecture of Nautobot v.2.x, as I knew that as soon as we upgraded, the old jobs would become unusable. I had read as much in posts in the Nautobot slack and had seen there was some documentation lying around (it’s also included in that giant post about the upgrade to v.2.x, here), but I had not yet gone deeper into it.
So I knew what I had to do, I just had to start doing it. I figured the first thing to do was to upgrade the prod instance to version 2. Using the same scheme I already had in place was possible for version 2; I had already done that while testing, and it’s published in the previous post, here (github repo here – heads up: I will update this to the current version using the official docker compose method as soon as time allows).
I gave that a try. I wasn’t satisfied with the performance. Also, while checking out start-up warnings in the logs and small details, I figured that I should probably migrate to the scheme NTC published for this: asking for support for your own thing instead of a community published reference is probably not a good idea. You might as well save some time for the people trying to help you, by spending some time yourself in the right direction.
I took another look at the nautobot docker compose repo: https://github.com/nautobot/nautobot-docker-compose , started reading what was documented there and attempted to figure out the structure and way of operation. I got a bit confused by certain things, so I turned to the Nautobot channel in the NTC slack and asked if anyone could answer a few questions about it. Justin showed up soon after and we started discussing, eliminating my questions one by one. I started installing the new scheme on a lab instance (there are instructions in readme.md).
What are the first steps?
Suppose you were following along, what would you do? Clone the repo of course!
sudo git clone https://github.com/nautobot/nautobot-docker-compose
That command will create a clone of the remote repo under the current directory. For example, if you run it in /opt, the new dir will be /opt/nautobot-docker-compose/. In the repo readme.md, Justin refers to the installation of components such as docker and docker compose (as a docker plugin) by referencing the docker documentation. I would suggest using these posts by Digital Ocean:
- https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-22-04
- https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-compose-on-ubuntu-22-04
You can choose a different OS version on those pages, if that is necessary for your case. You might also want to consider beforehand which python version you will be using, since support for python 3.12 is slowly being introduced in the Nautobot ecosystem (the Nautobot package supports it, and many plugins do, but not all). Remember, different Ubuntu versions will get you different python versions by default:
- 24.04.1 LTS will probably get you python 3.12
- 22.04.5 LTS will probably get you python 3.11
In the lab I went with 24.04 and soon after had to install python 3.11 to use for installing packages and poetry. Docker, Docker Compose and Poetry should get you to the starting point.
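If it helps, here is a rough sketch of how that tooling setup could look on Ubuntu 24.04. Treat it as an assumption-laden outline (in particular, the deadsnakes PPA as the source of python 3.11), not a verified procedure:

```shell
# Sketch only: getting Python 3.11, pipx and Poetry onto Ubuntu 24.04.
# Assumes the deadsnakes PPA provides python3.11 for your release; verify first.
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.11 python3.11-venv pipx
pipx ensurepath        # make pipx-installed tools available on PATH
pipx install poetry    # Poetry in its own isolated environment
poetry --version       # sanity check
```

Docker and Docker Compose come on top of that, following the Digital Ocean posts above.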
What’s the deal with Poetry? And what the h&*(% is Invoke?
Now calm down.. I realize you are a network engineer (OG) and you are already upset enough having to deal with Python, IDEs, git, and the whole NetDevOps deal (not to mention administering all the systems you are running for your lab infrastructure). Having also to deal with Poetry for this installation may make you feel uneasy, I understand that..
I asked around (although I haven’t asked Justin himself about that in particular), and I believe the choice was made because poetry is more versatile for managing dependencies and dealing with packages than pip. I have to say I agree, and I saw it as an opportunity to learn. There is also some reference to all this in the github repo readme.md, so I am guessing Justin agrees too.
It’s not that hard to find more info on Poetry. Here are a few links:
- https://python-poetry.org/ – Official Website
- https://python-poetry.org/docs/ – Official Documentation, contains installation procedure. I used pipx:
- https://pipx.pypa.io/stable/installation/ – Official Site
- https://github.com/pypa/pipx – Github repo, contains instructions.
- https://www.packetcoders.io/an-intro-to-python-poetry/ – Packet Coders short blog post. They also have a recorded session about Poetry in the member’s area (needs a subscription, well worth it).
Once you have installed Poetry, Docker and Docker Compose, and cloned the nautobot docker compose repo, you’ll be ready to go!
That’s all? Wait.. Are you pulling my leg?
Yes! It’s never that simple, there’s always more, you should know that by now. But before I add more detail on everything involved, let me try to help you grasp a very subtle notion no one has (yet) written down about this installation scheme/environment.
Why do you suppose we are so concerned with creating a python based virtual environment layer on the docker host with Poetry, while our Nautobot infrastructure will “live” in containers which can have whatever python version they like? Why tie a part of the host software packages and a virtual environment on the host to a containerized infrastructure, which is supposed to be chosen, among other things, to keep the two worlds separate: the docker host from the docker containers?
Are you saying that was a mistake?
It’s true that it initially seems a bit strange as a choice. But it does serve a greater purpose: you can easily use that environment on the host for developing code and jobs while running Nautobot in the containers, as the virtual environment will have the same packages as the containers, and it will be used… (drum roll)… to create the Nautobot image.
You forgot, didn’t you? Just like in the other scheme, it’s the same in this one: we need to build a Nautobot image. In our previous scheme we had to do this to make sure Nautobot supported ldap, ssl for https connections towards job repos (e.g. gitlab), etc. That step is included here as well, and Poetry will help prepare things.
And invoke?
Invoke.. I haven’t mentioned it yet. Yes, invoke is used in this scheme to control the infrastructure instead of docker compose, and it covers more functions than you could handle with docker compose alone, in “a more pythonic way” (that’s what they say; I don’t know, camels and snakes, what do you expect from those people?)
I guess that means you can write the tasks/processes in python, which makes it more versatile, as you will see later on, since we can add to the existing functionality very easily.
So what is invoke? Well (as it says on the website), Invoke is a Python (2.7 and 3.4+) library for managing shell-oriented subprocesses and organizing executable Python code into CLI-invokable tasks.
Basically, you run an invoke task and a decorated function inside the tasks.py file in the current dir is called to handle it. So instead of docker compose up -d, we write invoke start, etc. The options for this project are all detailed in the repo readme.md. Here are two links for reading more on invoke:
- https://www.pyinvoke.org/ – Project website
- https://github.com/pyinvoke/invoke – Project github repo
So do we need to install invoke too? You didn’t include that in your list.
No, invoke is installed with the rest of the python packages after creating the poetry virtual environment for Nautobot and executing the poetry install command (just remember to add --no-root, i.e. poetry install --no-root; if you don’t, it’s going to try to build a new package. Yeah, poetry can do that too..)
So make sure you have the correct python version installed (I used python 3.11), and then run:
poetry env use python3.11
poetry shell
to enter the virtual environment. Once you do that, you can start adding packages to the pyproject.toml file by using commands like the following:
poetry add nautobot==2.3.11 (I used 2.3.2 for my first experiment)
poetry add nautobot-device-onboarding==4.0.0
etc.
How do you know which numbers to use for the packages?
Ffs man, do a little research on your own! Go to each Nautobot app repo page on Github and search for the latest releases. Start here: https://github.com/orgs/nautobot/repositories?type=all to find them all. However, be mindful: some will not play well with others until some dependency issues are resolved. In my case this was true for the SSoT and Device Onboarding apps. I had to stick to Device Onboarding v.4.0.0 and SSoT v.2.8.0 for Nautobot 2.3.2, but you may be able to use Device Onboarding 4.1.0 and SSoT 3.0.0 or 3.0.1 (I will make sure and let you know). I guess those two projects have some road ahead to become fully compatible, as Device Onboarding’s current version is 4.1.0 and SSoT’s current version is 3.2.0. My pyproject.toml file is included below.
I see you mumbling over there… The best way to keep up with things like that is the Nautobot channel in the NTC slack. You can ask your questions there. Head here and follow instructions: https://slack.networktocode.com/
Ok.. do I run invoke start?
Ehm… no. We haven’t talked about the folder structure, what you need to change, how to handle LDAP, and how you can add nginx in front of Nautobot in order to provide https access. Let’s take care of those. Call out each item and I will give you the answers. I also have a short story about those performance issues; I will tell you in the changes section.
Files and Directories – Root folder
There are some important files in the root folder.
invoke.yml
This is the file that defines what will be called from the invoke start task. It contains options such as the docker compose yml directory (environments/), where the files containing the options for the various services will be placed, and a reference for each file. There are examples so you can build your own. LDAP is often referenced, as there are requirements for its support. Here is what mine looks like; I have added the docker compose file for nginx:
---
nautobot_docker_compose:
  project_name: "nautobot-docker-compose"
  python_ver: "3.11"
  local: false
  compose_dir: "environments/"
  compose_files:
    - "docker-compose.postgres.yml"
    - "docker-compose.ldap.yml"
    - "docker-compose.local.yml"
    - "docker-compose.nginx.yml"
pyproject.toml and poetry.lock
These are the files containing the poetry settings: the first contains the list of the packages and their versions, and the second is produced and updated dynamically via poetry commands. Here is what my pyproject.toml looked like at the time of the change, with Nautobot version 2.3.2 (the current version is now 2.3.11):
[tool.poetry]
name = "nautobot-docker-compose"
version = "0.1.1"
description = ""
authors = ["John Doe <johndoe@mail.com>"]

[tool.poetry.dependencies]
python = ">=3.9,<3.12"
nautobot = "2.3.2"
# nautobot-example-plugin = {path = "plugins/plugin_example", develop = true}
nautobot-device-onboarding = "4.0.0"
nautobot-ssot = "2.8.0"
nautobot-golden-config = "2.1.2"
nautobot-plugin-nornir = "2.1.0"
nautobot-device-lifecycle-mgmt = "2.2.0"
nautobot-bgp-models = "2.2.0"
nautobot-floor-plan = "2.3.0"
nornir = "^3.4.1"
nornir-utils = "^0.2.0"
nornir-netmiko = "^1.0.1"
nornir-nautobot = "^3.2.0"
nornir-pyxl = "^1.0.1"
nornir-netconf = "^2.0.0"
nornir-inspect = "^1.0.3"
xmltodict = "^0.13.0"
ipdb = "^0.13.13"

[tool.poetry.dev-dependencies]
invoke = "*"
toml = "*"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
You may notice a couple of things, like the versions for the packages and the line that is commented out. The versions are automatically added when you run those poetry add commands inside the poetry shell, so let poetry do this for you. You only need the packages you want installed; these are just the ones I use.
The nautobot-example-plugin line needs to be commented out, because if you don’t do it you will have problems when building the image (among other things). Remember, this repo is also purposed to help you build things (like nautobot plugins) and turn them into python packages (one area where Poetry helps a lot). But that’s not what we are doing. We are building Nautobot for production, so remember that, because we are coming back to it.
tasks.py – adding some more tasks
This file contains the different decorated functions that handle all the tasks you would normally run with docker compose, plus a couple more like backup and restore (oh yeah… but be careful to read the instructions on the github repo for those; they might not work as you expect if you don’t).
It’s too big to put here, but here’s an example of the functions I added to augment the functionality (I am supposed to submit those in a PR, which I will do soon after this).
@task
def logs(context):
    """Show logs for all Nautobot services defined in the docker-compose files."""
    print("Showing Nautobot Logs")
    docker_compose(context, "logs --follow --tail 500")

@task
def softstart(context):
    """Start Nautobot containers when previously stopped (not deleted)."""
    print("Starting Nautobot back up...")
    docker_compose(context, "start")

@task
def softstop(context):
    """Stop Nautobot containers without deleting them."""
    print("Stopping Nautobot Containers...")
    docker_compose(context, "stop")
Files and Directories – sub folders
I will mention just two of them, environments and config.
In environments you will find everything particular to your environment that is needed to run the nautobot instance:
- the credentials file
- environment variables
- the docker compose files (all those defined in invoke.yml)
- the Dockerfile for creating the image (I am using the LDAP one).
In the config directory you will find the nautobot_config.py file and a version for ldap. You will notice that in those files, contrary to what you may be used to, all the default options that are usually present as comments are not there anymore. You just put in what you need. So you may ask: where do I find the current default options? You can do one of the following:
- create the nautobot version you want, or upgrade the one you have to that version, without the config file mounts, and then copy the config out of the container. I guess it sounds stupid, and you do need to create a vm snapshot to go back to, because you are really risking messing up your database if the upgrade modifies it and you then go back to the older version without reverting to the snapshot.
- use the method described in the doc, perhaps after you upgrade using the same config, hoping it will work, and again copy the config out of the container (or just redirect the output to a file on the host, you figure it out). Come to think of it, you still have the same problem about messing up your db if you don’t use a snapshot..
- do the whole thing on another system, or just use a new dir and rename the bloody named volume mount where your db is created. Then do whichever you prefer of the ones above.
So basically don’t forget to get a snapshot, use a different volume name or backup and then restore your db. And good luck with those.
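As a sketch of the “copy the config out of the container” idea, something along these lines should work, assuming the service is named nautobot as in the nautobot-docker-compose repo (and that you run it with the same compose files invoke uses). nautobot-server init generates a fresh default config at the given path:

```shell
# Sketch: generate a fresh default config inside the running container,
# then copy it out to the docker host. Service name and paths are assumptions.
docker compose exec nautobot nautobot-server init /tmp/default_config.py
docker compose cp nautobot:/tmp/default_config.py ./default_config.py
```

That way you get the current defaults to compare against, without touching the nautobot_config.py your instance is actually mounting.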
About the ldap config, you need the same things I posted in the previous post, where the new default config thing is also mentioned (I searched on google and there it was, my own post, 7th link in the row.. should I feel stupid or smart? I don’t know).
What you need to change
Well.. a lot. Start by creating your own invoke.yml, since it does not exist and you have to create one based on the examples. If you cloned the ntc repo, you may want to add a few entries at the end of the .gitignore file for your own additions, so they won’t show up as changes for the git repo.
You also need to change things in your docker compose files, such as the name of your company/group in the name of the image, possibly the image version for postgres (I am using 14.5), and possibly some new volume mounts that you need (I do).
The rest of the changes are like last time, adding your plugins in the config file, changes in the .env files to suit your case, etc.
Add Nginx as reverse proxy with https
To put nginx in front of nautobot, you have to add the usual stuff, like the certificates, an nginx conf file that has to be mounted, and the docker compose file for nginx, as was described in the previous posts about how to set up nautobot (including how to create the self-signed certificates, if you need to). Here is what the nginx docker compose file looks like in my case:
---
services:
  nginx:
    image: nginx:1.21.1-alpine
    depends_on:
      - "nautobot"
    ports:
      - "443:443"
    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
      - ../nginx-default.conf:/etc/nginx/conf.d/default.conf:ro
      - ../ssl/server-name.crt:/etc/ssl/certs/nginx.crt:ro
      - ../ssl/server-name.key:/etc/ssl/private/nginx.key.pem:ro
The nginx conf is the same as in the previous post.
Dockerfile-LDAP
There was a lot more to change in Dockerfile-LDAP, including adding in the certificates, because at some point you need to communicate with a git server (like gitlab), and if that server is using the same enterprise CA or also uses self-signed certs, then those need to be in the container. However, the process of creating the image this time, along with the performance issues, led to a discovery I was not expecting, but one that I should have noticed on my own, right here:
CMD ["nautobot-server", "runserver", "0.0.0.0:8080", "--insecure"]
If you have ever installed nautobot on a vm using the guide, you will remember the step where you launch the test server to make sure everything is working; but then it’s explained how this is not suitable for production, so uwsgi is introduced and nginx is put in front as a webserver. So in this version of the Dockerfile-LDAP it should be pretty clear that the test/dev server is used. I will get back to that in a second.
I also ran into problems building the image, as I noticed that the procedure was failing while trying to copy the code for those plugins I was supposed to be developing inside the image. But since I wasn’t developing any, there was nothing to copy, so the building of the image failed. Once I figured that out, I removed the example plugin from the pyproject.toml file and commented out everything related to that source code and plugins.
Development vs Production
In those exchanges we had with Justin, he mentioned the word development a few times, so I felt the need to remind him I was building this for production. So he said, “Oh, if you are building this for production then you probably don’t want to run the test/dev server for this”. A couple of minutes later I understood I was supposed to be running uwsgi instead, so I had to change the command that runs the server (look in the docker-compose.local.yml file) and provide a uwsgi.ini file to mount from the project root folder on the docker host. I left the command for the test server alone in the build stage in Dockerfile-LDAP, as it didn’t seem to matter that much for building the image.
Once that was done, performance was significantly improved. I am not sure what I was running all this time in docker, as there was no visible reference in the dockerfile or some kind of override in the docker compose (so it depends on what was provided inside the original nautobot image). From what I was told by people who know more than I do, there are examples of uses in the field where the dev server is fine, provided there is enough awareness to know when your sizing is enough, and the ability to scale out using any means necessary or available (such as load balancers, etc).
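For reference, here is a minimal uwsgi.ini sketch along the lines of the Nautobot deployment docs; the port and worker count are assumptions, size them to your host:

```ini
[uwsgi]
; fail to start on unknown configuration keys
strict = true
; master process that pre-forks workers
master = true
; allow application-generated threads
enable-threads = true
; listen address/port inside the container (8080 assumed here)
http = 0.0.0.0:8080
; the Nautobot WSGI application
module = nautobot.core.wsgi:application
; number of workers
processes = 4
```

The command override in docker-compose.local.yml then becomes something like nautobot-server start --ini /opt/nautobot/uwsgi.ini, with the ini file mounted at that path.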
Are those changes/updates available anywhere?
They will soon be under a new github repository that I will link here (and update the post). It’s a few more hours work. So come back in a couple of days.
What about jobs changes?
Well, it’s essentially what is described in the documentation. If you read up on that, it points you to a separate article here: https://docs.nautobot.com/projects/core/en/stable/development/jobs/migration/from-v1/ . You can get a good grip on the changes from the examples; for example, data is no longer used as a parameter, and you will most likely need to register your jobs so that they become available in the nautobot ui.
As I was pressed for time, I enlisted chatgpt to help me analyze what the differences are from v1 to v2 and what I should change in my code. As usual, it got a big percentage of things correct and messed up a bit in some cases, so it made me spend a bit more time separating the good stuff from the bad and correcting the AI. Among the first things we established to be different were the following:
- sys.exit(1) replaced with return, to avoid abrupt job termination.
- More robust exception handling in API requests.
- Modularized job structure for flexibility and better error reporting.
I should probably do a different post about this as there is so much more to it. Perhaps the NTC blog posts have done a better job at presenting jobs than I have.
I managed to convert one job that checks for doc drift between Nautobot and Cisco Prime Infrastructure and I have so much more I need to work on. I will let you know.
And the automation code that needed changing?
I still haven’t totally opened that can of worms.. I did some debugging at the time of the migrations, when I spun up the lab in v.2.0, but there is a lot more to do, with the mac address collection code first and every call to nautobot to follow. That’s also material for another post, not before January 2025.
Ready to run?
Yes, now you can follow the instructions on the github repo:
poetry shell
- add your packages
poetry lock
poetry install --no-root
invoke build
invoke start
If you added my tasks as well, you may want to run invoke logs too. Remember, unless you are in the Poetry virtual environment, invoke does not work. You can type exit to get out of it if you need to.
So how did the “house clean up” go?
It’s still going. Using the new Device Onboarding plugin, things have been going a lot faster with integrating the new SDA network devices into the database. As more people started to use Nautobot, I discovered more changes, such as the fact that Role had moved to extras, so its permissions were missing from my database (which I have been carrying since the Netbox days, between migrations and upgrades). There were more things to correct with permissions, to allow non-superusers to modify objects such as ip addresses, devices, racks, etc.
Updating the ways I do backup/restore/sync was also another task, as the new scheme allows for backup and restore through invoke, but before you restore, it assumes you have used invoke stop, which is essentially the equivalent of a docker compose down. My previous scripts assumed everything was running and were independent of what the instance is (well, sort of..). I am still using those in certain cases, for example to sync to my vm based instances, but I modified the scripts a bit and added a transport script that carries the backup file produced by invoke db-export over to the remote servers. The next step is to create a different script to copy the backups to a remote storage volume and do housekeeping, deleting previous backups after a predefined period of time.
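For the housekeeping part, a minimal sketch of what that retention logic could look like; the directory name, file pattern and retention period are assumptions, adjust them to your own layout:

```shell
# Sketch: prune old backup dumps produced by invoke db-export.
# BACKUP_DIR and RETAIN_DAYS are hypothetical defaults; override as needed.
BACKUP_DIR="${BACKUP_DIR:-$HOME/nautobot-backups}"
RETAIN_DAYS="${RETAIN_DAYS:-14}"
mkdir -p "$BACKUP_DIR"
# delete .sql dumps whose modification time is older than RETAIN_DAYS days
find "$BACKUP_DIR" -name '*.sql' -type f -mtime "+$RETAIN_DAYS" -delete
```

Something like this could run from cron, right after the transport script has copied the dump off-box.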
Enough for now
Poetry, python based tasks, changes for modularity; a bit too much for you? Can’t you see it’s a step in the right direction? It’s meant to help you, you know. You do need to move forward, just as everyone else does.. Just take a look at what’s been happening around you.. Autocon 2 is now ongoing in Denver, the impact of automation and programmability is starting to be significant, and tools like this are becoming part of an ecosystem, coupled and integrated with more tools dedicated to specific areas of interest, such as network discovery and more. Having a distribution platform for Network Automation has never been more important. I was at a get-together last night for our NetAutoGr group, and the discussions, where pure networking and python were intermixed, were amazing.. I really believe things are starting to change. Keep your eyes peeled and your ear to the ground..
See you in the next post. Remember to come back for the updates! Get out there and automate!