Week 1 · Coding Period

Building the AGL image, and proving an AGL VM can subscribe

The goal for the week: compile a working AGL+ROS2 image, stand up rosbridge, and prove — with fake data first — that an AGL virtual machine can receive ROS2 topics over QEMU's networking.

1. Building AGL + ROS2 — and a recipe bug

With the tree configured, the build is one command (run inside tmux so it survives disconnects):

cd ~/AGL/master/build-qemu-ros
source ../external/poky/oe-init-build-env .
bitbake agl-image-minimal

Roughly 7,500 tasks in, it died:

The bug
ERROR: babeltrace-1.5.11-r0 do_compile: oe_runmake failed
| make: *** No targets specified and no makefile found.  Stop.

“No makefile found” means the configure step, which is supposed to generate the Makefile, produced nothing. The configure log confirmed it with a single line, which is bizarre for an autotools project that clearly has a configure.ac:

NOTE: nothing to configure

To find out why, I asked BitBake to print the variables it computed for this recipe:

bitbake -e babeltrace | grep -E "^(S|CONFIGURE_SCRIPT)="
S=".../babeltrace/1.5.11/babeltrace-1.5.11"   # ← wrong folder

And on disk:

ls .../1.5.11/babeltrace-1.5.11/configure.ac   # No such file or directory
ls .../1.5.11/git/configure.ac                # exists ✓

There it is. The recipe fetches source with git://, which BitBake unpacks into a folder called git/ — but the recipe never told BitBake that, so the source variable S kept its default (an empty babeltrace-1.5.11/). BitBake looked there, found no configure.ac, shrugged (“nothing to configure”), and left no Makefile.

The fix — one line
echo 'S = "${WORKDIR}/git"' >> \
  ~/AGL/master/external/meta-ros/meta-ros2/recipes-kernel/lttng/babeltrace_1.5.11.bb

# clean the broken state and confirm just this recipe builds
bitbake -c cleansstate babeltrace && bitbake babeltrace
# → all tasks succeeded

Then the full image build ran to completion:

NOTE: Tasks Summary: Attempted 11050 tasks of which 7510 didn't need to be rerun and all succeeded.

This is a genuine upstream defect — a git recipe missing its S override. It's a one-line fix that's a strong candidate for a pull request to meta-ros. (A handful of other small layer issues surfaced during configuration too — a missing layer dependency and a packagegroup syntax slip — each fixable upstream.)
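Since the defect is a pattern rather than a one-off, a rough scan for other recipes with the same shape could look like the sketch below. It is a heuristic only: the function name is hypothetical, it reads a single recipe's text, and it cannot see an S assignment that lives in an included .inc file or a .bbappend. The authoritative answer is still bitbake -e for the recipe in question.

```python
import re


def needs_s_override(recipe_text: str) -> bool:
    """Heuristic: True if the recipe fetches via git but never assigns S.

    Hypothetical helper for illustration; it only inspects the text it is
    given, so assignments in .inc or .bbappend files are invisible to it.
    """
    # Ignore comment lines so a commented-out assignment does not count.
    lines = [ln for ln in recipe_text.splitlines()
             if not ln.lstrip().startswith("#")]
    body = "\n".join(lines)

    # git:// and gitsm:// are both handled by BitBake's git fetcher.
    fetches_git = re.search(r"\bgit(sm)?://", body) is not None

    # Any of the common assignment operators applied to the variable S.
    sets_s = re.search(r"^\s*S\s*(?:\?\?=|\?=|:=|=)", body, re.MULTILINE) is not None

    return fetches_git and not sets_s
```

Run over the text of each .bb file in a layer, it flags candidates worth checking by hand with bitbake -e.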

2. Checking the image is fit for purpose

Before relying on it, I verified the image contained what the test client would need. Yocto splits Python into dozens of sub-packages, so I specifically checked for python3-json — not just python3-core:

grep -E "^(python3-core|python3-json|busybox|openssh) " ...rootfs.manifest
python3-core   3.12.12
python3-json   3.12.12
busybox        1.36.1     # provides wget
openssh        9.6p1
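The same check can be scripted so it runs after every build. A minimal sketch, assuming each manifest line starts with the package name followed by whitespace-separated fields (as in the grep output above); the function name is hypothetical.

```python
def missing_packages(manifest_text: str, required: list[str]) -> list[str]:
    """Return the required package names that do not appear in the manifest.

    Assumes one package per line with the name as the first
    whitespace-separated field; blank lines are skipped.
    """
    installed = {
        line.split()[0]
        for line in manifest_text.splitlines()
        if line.strip()
    }
    # Preserve the caller's order so the report is predictable.
    return [name for name in required if name not in installed]
```

An empty list means the image has everything the client needs; anything else names exactly what to add to the image.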

3. Standing up rosbridge

rosbridge installs into the ros2 conda environment:

conda activate ros2
mamba install -y -c robostack-staging -c conda-forge ros-humble-rosbridge-suite
ros2 launch rosbridge_server rosbridge_websocket_launch.xml   # binds 0.0.0.0:9090

Two quirks
On first launch rosbridge takes ~10 seconds to actually bind the port; connect too early and you get “connection refused.” It also ignored SIGTERM in my setup, so stopping it took kill -9.
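One way to sidestep the slow bind is to poll the port before connecting instead of sleeping for a fixed time. A minimal sketch using only the standard library; the function name and default timings are assumptions, not part of rosbridge.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connect to host:port succeeds or timeout elapses.

    Returns True as soon as a connection is accepted, False if the
    deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect proves something is listening; close at once.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "connection refused" and connect timeouts alike.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Called as wait_for_port("127.0.0.1", 9090) on the host right after launching rosbridge, it returns once the WebSocket port is accepting connections.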

4. A WebSocket client in pure standard-library Python

The minimal AGL image has python3 but no WebSocket library and no easy way to install one into the running VM. So the client had to use only the standard library — implementing the WebSocket handshake and the rosbridge subscribe protocol by hand. The protocol itself is simple: after the handshake, subscribing is a single line of JSON.

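import json

# s is the socket after a successful handshake; ws_send_text / ws_recv_text
# are the client's own hand-written frame helpers (not shown here)
# subscribe is one JSON message over the socket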
ws_send_text(s, json.dumps({"op": "subscribe", "topic": "/carla/odom"}))

# then read frames; each "publish" carries the ROS message as JSON
while True:
    d = json.loads(ws_recv_text(s))
    if d.get("op") == "publish":
        pos = d["msg"]["pose"]["pose"]["position"]
        print(f"x={pos['x']:.2f} y={pos['y']:.2f} z={pos['z']:.2f}")
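The handshake that precedes the subscribe is a plain HTTP upgrade request plus one hash check. The sketch below follows RFC 6455; it is not the client's actual code, and the function names are hypothetical.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_handshake(host: str, port: int, path: str = "/") -> tuple[bytes, str]:
    """Build the HTTP upgrade request; return (request bytes, random key)."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return request.encode("ascii"), key


def expected_accept(key: str) -> str:
    """Value the server must echo back: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_ok(response: bytes, key: str) -> bool:
    """Check for a 101 status and a matching Sec-WebSocket-Accept header."""
    head = response.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    if not lines[0].startswith("HTTP/1.1 101"):
        return False
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)
```

In use, the request bytes go out with sendall on a connected socket, and the bytes read back up to the blank line are passed to handshake_ok before any frames are sent.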

Two traps learned the hard way
(1) Client-to-server WebSocket frames must be masked (XOR'd with a random 4-byte key) or the server rejects them. (2) When Python's stdout goes through a pipe rather than a terminal it block-buffers, so a process killed before the buffer flushes loses its output; run with python3 -u when piping.
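To make the masking rule concrete, here is a minimal sketch of frame encoding and decoding per RFC 6455. It handles only single, unfragmented text frames, and the function names are hypothetical stand-ins, not the client's real helpers.

```python
import os
import struct
from typing import Optional


def encode_text_frame(text: str, mask_key: Optional[bytes] = None) -> bytes:
    """Encode one masked client-to-server text frame.

    mask_key is injectable for testing; normally a fresh random
    4-byte key is generated for every frame.
    """
    payload = text.encode("utf-8")
    key = mask_key if mask_key is not None else os.urandom(4)
    header = bytearray([0x81])  # FIN=1, opcode 0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(0x80 | n)            # 0x80 sets the MASK bit
    elif n < 65536:
        header.append(0x80 | 126)
        header += struct.pack("!H", n)     # 16-bit extended length
    else:
        header.append(0x80 | 127)
        header += struct.pack("!Q", n)     # 64-bit extended length
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + key + masked


def decode_server_text_frame(frame: bytes) -> str:
    """Decode one complete, unfragmented, unmasked server text frame."""
    if frame[0] != 0x81:
        raise ValueError("only single unfragmented text frames are handled")
    if frame[1] & 0x80:
        raise ValueError("server-to-client frames must not be masked")
    n = frame[1] & 0x7F
    offset = 2
    if n == 126:
        (n,) = struct.unpack("!H", frame[2:4])
        offset = 4
    elif n == 127:
        (n,) = struct.unpack("!Q", frame[2:10])
        offset = 10
    return frame[offset:offset + n].decode("utf-8")
```

The asymmetry is the point: the client always sets the MASK bit and prepends the key, while frames coming back from the server carry the payload in the clear.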

5. Proving the rosbridge link with fake data

Rather than bring up the heavy CARLA simulator immediately, I isolated the new risk — can an AGL VM reach the host's rosbridge? — by feeding rosbridge a fake topic and connecting from the VM. First, boot AGL with QEMU's user-mode networking:

runqemu qemux86-64 slirp nographic     # login: root (no password)

The key fact that makes this work: with QEMU's slirp (user-mode) networking, the guest reaches the host at 10.0.2.2 by default, with no port-forwarding or other setup. So the host's rosbridge on :9090 is reachable from the VM at ws://10.0.2.2:9090.

Get the client into the VM by serving it over HTTP from the host and fetching it — which also proves VM→host networking works:

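# host: a fake /carla/odom at 5 Hz, plus a file server
# (both commands stay in the foreground, so use separate terminals;
#  start the file server from the directory that holds agl_ws_test.py)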
ros2 topic pub -r 5 /carla/odom nav_msgs/msg/Odometry "{pose: {pose: {position: {x: 7.5, y: 8.25, z: 1.0}}}}"
python -m http.server 8000

# inside AGL
wget -O /tmp/agl_ws_test.py http://10.0.2.2:8000/agl_ws_test.py
python3 -u /tmp/agl_ws_test.py 10.0.2.2 9090 /carla/odom
[agl] websocket handshake OK, subscribing...
[agl] #1 /carla/odom  x=7.50 y=8.25 z=1.00
[agl] #2 /carla/odom  x=7.50 y=8.25 z=1.00
...
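The per-message handling behind those output lines can be tested on the host without a VM by feeding it sample JSON. A minimal sketch, assuming the rosbridge publish shape shown earlier (op, topic, msg); the function name is hypothetical and the exact log prefix of the real client is not reproduced.

```python
import json
from typing import Optional


def format_odom(raw: str) -> Optional[str]:
    """Turn one rosbridge text message into a position line.

    Returns None for anything that is not a well-formed "publish"
    carrying an Odometry-style pose, so the caller can simply skip it.
    """
    try:
        d = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(d, dict) or d.get("op") != "publish":
        return None
    try:
        pos = d["msg"]["pose"]["pose"]["position"]
        return (f"{d.get('topic', '?')}  "
                f"x={pos['x']:.2f} y={pos['y']:.2f} z={pos['z']:.2f}")
    except (KeyError, TypeError, ValueError):
        # Missing fields or non-numeric values: treat as not printable.
        return None
```

Returning None instead of raising keeps the receive loop alive when rosbridge sends something other than a publish, such as a status message.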

Week 1 result
A from-scratch AGL image builds and boots, rosbridge is running, and an AGL virtual machine successfully subscribes to a ROS2 topic over WebSocket. The only remaining unknown is the data source, which becomes Week 2's job: swap the fake publisher for the real CARLA simulator.
