Northline Robotics — ResearchNLR · Field Notes · 00:00:00Z

Research / Localization

Localization & State Estimation

Localizing a robot when GPS won't cooperate

Under a tree canopy or beside a building, satellite positioning quietly falls apart. Here is how an autonomous mower keeps a centimetre-grade fix anyway — and why the same math is what lets a drone fly where GPS can't reach.

Ask a robotic mower to do one apparently simple thing — cut a lawn in clean, slightly overlapping passes without wandering into the flowerbed — and you immediately collide with one of the oldest problems in robotics: the machine does not know where it is. A consumer GPS receiver is accurate to a few metres on a good day. A few metres is the entire width of many lawns. And the lawn is precisely the place GPS is weakest: under trees, beside walls, in the radio shadow of a house.

This article works through how a ground robot maintains an accurate pose estimate when satellite positioning is unreliable or absent. We will build up from the probabilistic foundations to a working sensor-fusion stack, with the equations and pseudocode you would actually implement. Everything here is domain-agnostic: it is the same machinery that lets a drone hold position indoors, and the same machinery NLR runs to keep a platform on-line in a whiteout.

01Why GPS fails on a lawn

Global Navigation Satellite Systems (GNSS — the umbrella term for GPS, GLONASS, Galileo, BeiDou) work by trilateration: the receiver measures the time-of-flight to several satellites and solves for its position. Two things wreck this near the ground:

  • Signal blockage. Foliage, walls, and terrain attenuate or fully block the line of sight to satellites. Fewer satellites in view means a weaker geometry and a larger position error — often degrading the dilution of precision until the fix is useless.
  • Multipath. Signals bounce off buildings and wet ground and arrive late, so the receiver mis-measures range. Multipath is insidious because the receiver still reports a confident-looking fix — it is just wrong, sometimes by metres.

The professional answer to GNSS accuracy is RTK (Real-Time Kinematic), which uses carrier-phase measurements and a nearby base station to reach centimetre accuracy. RTK is what makes wire-free mowing viable at all. But RTK needs to resolve integer carrier-phase ambiguities, and under canopy it constantly drops from a fixed solution to a float solution — degrading from 2 cm to tens of centimetres or worse, then losing the fix entirely. You cannot build a reliable robot on a sensor that disappears every time the machine drives under the maple in the corner of the yard.

Key idea

Never trust a single positioning source. The job of a localization system is to fuse many imperfect sensors into one estimate that is better than any of them alone — and to know how uncertain that estimate is.

02Localization as probabilistic belief

The modern framing, formalized in Thrun, Burgard and Fox's Probabilistic Robotics,1 is that a robot does not hold a single pose — it holds a probability distribution over poses, called the belief. Let the state at time \(t\) be \(x_t\) (for a ground robot, the planar pose \(x_t=[p_x, p_y, \theta]^\top\)). The robot receives control inputs \(u_t\) (wheel commands) and measurements \(z_t\) (from its sensors). The belief is the posterior

$$ bel(x_t) = p\big(x_t \mid z_{1:t},\, u_{1:t}\big). $$

The Bayes filter updates this belief recursively in two steps every cycle. First a prediction that pushes the belief forward through the motion model and (always) increases uncertainty:

$$ \overline{bel}(x_t) = \int p(x_t \mid u_t, x_{t-1})\, bel(x_{t-1})\, dx_{t-1}. $$

Then a correction that multiplies in the latest measurement and (usually) decreases uncertainty:

$$ bel(x_t) = \eta\, p(z_t \mid x_t)\, \overline{bel}(x_t), $$

where \(\eta\) is a normalizer. Every practical localization algorithm — Kalman filters, particle filters, factor-graph smoothers — is an implementation of this recursion under different assumptions. Hold onto the two-step rhythm: predict from motion, correct from observation.

03Dead reckoning and the drift problem

The cheapest pose source is dead reckoning: integrate the robot's own motion. Wheel encoders give distance travelled; an IMU gives angular rate and acceleration. A standard differential-drive motion model advances the pose over a timestep \(\Delta t\) given left/right wheel velocities \(v\) and yaw rate \(\omega\):

$$ \begin{aligned} p_{x,t} &= p_{x,t-1} + v\,\Delta t\,\cos\theta_{t-1} \\ p_{y,t} &= p_{y,t-1} + v\,\Delta t\,\sin\theta_{t-1} \\ \theta_t &= \theta_{t-1} + \omega\,\Delta t. \end{aligned} $$

This is wonderfully smooth and high-rate — and doomed on its own. Every measurement has noise, and integration accumulates that noise. Wheel slip on wet grass adds a bias the encoders cannot see. The result is drift: error grows without bound. Heading error is the killer, because a small angular bias swings the position estimate further and further off the longer you drive. Integrating raw IMU acceleration is worse still — position error from accelerometer bias grows roughly with the square of time.

So dead reckoning gives us a great short-term, high-rate motion estimate with unbounded long-term error. GNSS (when available) gives us a noisy but drift-free absolute position. These are complementary. Fusing them is the whole game.

04Fusing sensors with the Extended Kalman Filter

The Kalman filter is the optimal Bayes filter when everything is linear and Gaussian. Robots are not linear — heading enters through sines and cosines — so we use the Extended Kalman Filter (EKF), which linearizes the models at the current estimate. The belief is now a Gaussian with mean \(\mu_t\) and covariance \(\Sigma_t\).

Let \(g(u_t, x_{t-1})\) be the (nonlinear) motion model and \(h(x_t)\) the measurement model. \(G_t\) and \(H_t\) are their Jacobians; \(R_t\) is the process (motion) noise covariance and \(Q_t\) the measurement noise covariance. The prediction step:

$$ \bar\mu_t = g(u_t, \mu_{t-1}), \qquad \bar\Sigma_t = G_t\,\Sigma_{t-1}\,G_t^\top + R_t. $$

The correction step, when a measurement \(z_t\) arrives, computes the Kalman gain \(K_t\) — the optimal blend between prediction and observation — and pulls the estimate toward the measurement in proportion to relative confidence:

$$ K_t = \bar\Sigma_t H_t^\top\big(H_t \bar\Sigma_t H_t^\top + Q_t\big)^{-1}, $$ $$ \mu_t = \bar\mu_t + K_t\big(z_t - h(\bar\mu_t)\big), \qquad \Sigma_t = (I - K_t H_t)\,\bar\Sigma_t. $$

The term \(z_t - h(\bar\mu_t)\) is the innovation — the surprise between what we predicted we would see and what we actually saw. If the GNSS fix is good (\(Q_t\) small), \(K_t\) is large and we snap to it; if the fix is multipath garbage (\(Q_t\) large), \(K_t\) is small and we mostly trust dead reckoning. Crucially, you set \(Q_t\) per measurement: a float RTK solution gets a fat covariance, a fixed solution a tight one. That single mechanism is how the robot gracefully rides out canopy dropouts.

Pseudocode — EKF localization cycle

# State: mu (3x1 pose), Sigma (3x3 covariance)
def ekf_step(mu, Sigma, u, z=None, Q=None):
    # --- PREDICT (runs every cycle, high rate) ---
    mu_bar    = motion_model(mu, u)            # g(u, mu)
    G         = motion_jacobian(mu, u)         # d g / d x
    R         = process_noise(u)               # grows with speed/slip
    Sigma_bar = G @ Sigma @ G.T + R

    if z is None:                         # no measurement this tick
        return mu_bar, Sigma_bar

    # --- CORRECT (runs when a sensor reports) ---
    H         = meas_jacobian(mu_bar)
    innov     = z - meas_model(mu_bar)         # innovation
    S         = H @ Sigma_bar @ H.T + Q        # innovation covariance
    if mahalanobis(innov, S) > GATE:           # reject outliers (multipath!)
        return mu_bar, Sigma_bar
    K         = Sigma_bar @ H.T @ inv(S)       # Kalman gain
    mu        = mu_bar + K @ innov
    Sigma     = (I - K @ H) @ Sigma_bar
    return mu, Sigma

Note the gate: before accepting a measurement we check its Mahalanobis distance against the innovation covariance \(S_t = H_t\bar\Sigma_t H_t^\top + Q_t\). A multipath GNSS reading that disagrees violently with a confident prediction is statistically implausible and gets rejected. Outlier gating is the difference between a filter that is robust and one that lurches into the rose bushes.

Wheel odometryIMU (gyro/accel) Camera (VIO)RTK-GNSS EKF / Factorgraph fusion Pose + covariance
Fig. 1 — Loosely-coupled fusion. High-rate dead-reckoning sensors are corrected by lower-rate absolute references; each measurement enters with its own covariance.

05Adding eyes: visual-inertial odometry

When GNSS is gone entirely, we need another source of drift-resistant motion. The most powerful is visual-inertial odometry (VIO): track visual features across camera frames and fuse that with the IMU. The camera tells you how the world moved relative to you (and therefore how you moved); the IMU fills the gaps between frames and resolves metric scale and gravity direction.

The two sensors are complementary on a sub-second timescale. The IMU is high-rate but drifts in seconds; the camera is lower-rate but anchors absolute structure. Forster et al.'s IMU preintegration2 made tight fusion efficient by summarizing many IMU samples between frames into a single relative-motion constraint, so the optimizer does not have to re-integrate inertial data every iteration. Systems built this way — VINS-Mono, OKVIS, ROVIO — achieve drift on the order of a fraction of a percent of distance travelled, with no external infrastructure at all.

VIO has two failure modes worth designing around, both relevant on a lawn: low texture (a uniform green carpet has few trackable features — point the camera slightly outward to catch edges, beds, and structures) and dynamic scenes (moving objects violate the static-world assumption and must be rejected as outliers).

06Killing drift: SLAM and loop closure

Odometry of any kind — wheel, visual, inertial — still drifts, just slowly. To eliminate drift you need to recognize a place you have been before and snap the trajectory back into consistency. That is SLAM: Simultaneous Localization and Mapping. The robot builds a map and localizes against it at the same time, and the magic ingredient is loop closure.3

Modern SLAM is usually posed as a factor graph and solved by nonlinear least squares (graph optimization). Robot poses are nodes; odometry and loop-closure detections are edges (constraints). The optimizer finds the trajectory \(X=\{x_i\}\) minimizing the total weighted constraint error:

$$ X^\* = \arg\min_{X} \sum_{i,j} \big\lVert f(x_i, x_j) - z_{ij} \big\rVert^2_{\Omega_{ij}}, $$

where \(z_{ij}\) is a measured relative transform between poses \(i\) and \(j\), \(f\) predicts it from the current estimate, and \(\Omega_{ij}\) is its information (inverse-covariance) weight. When a loop closure adds an edge between a current pose and an old one, optimization distributes the accumulated error back across the whole loop — drift visibly snaps out of the map. Feature-based systems like ORB-SLAM4 detect these closures by recognizing previously-seen visual landmarks; libraries such as GTSAM and g2o do the optimization. Cadena et al.'s survey3 is the canonical map of this territory.

The mental model

Odometry answers "how did I just move?" (smooth, drifts). SLAM answers "where am I on the map I'm building?" (drift-free, needs revisits). Absolute sensors like RTK-GNSS answer "where am I in the world?" (drift-free, intermittent). A robust robot runs all three and lets the filter weigh them.

07The NLR localization stack

Put together, a wire-free mower (or any outdoor ground robot) runs a layered estimator with graceful degradation:

  1. Core dead reckoning — wheel odometry + IMU, fused at high rate, providing a smooth pose between everything else.
  2. Absolute correction — RTK-GNSS folded in with a covariance that tracks fix quality: tight when fixed, loose when float, gated out when it disagrees with a confident prior.
  3. Visual-inertial odometry — the drift-resistant fallback that carries the robot through canopy where GNSS is unusable.
  4. Map relocalization — because a mower revisits the same lawn constantly, it builds a persistent feature/elevation map on the first passes and relocalizes against it on every subsequent visit, turning loop closure into an everyday event rather than a lucky one.

The design principle is no single point of failure: each layer covers the others' weaknesses, and the covariance bookkeeping means the planner always knows how much to trust the current pose — and can slow down or stop when uncertainty grows too large near a boundary.

08Measuring whether it works

You cannot improve what you cannot measure. Against a ground-truth trajectory (from survey-grade RTK or a total station), two standard metrics dominate.5 Absolute Trajectory Error (ATE) measures global consistency — the RMS distance between estimated and true positions after alignment:

$$ \mathrm{ATE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \big\lVert \hat p_i - p_i \big\rVert^2 }. $$

Relative Pose Error (RPE) measures local drift — how much error accumulates over a fixed window — and is the better gauge of odometry quality. Track both over real field logs, not just simulation: wet grass, low sun glare, and seasonal canopy change are exactly the conditions that break a filter tuned only on a clean test.

09How this transfers to drones

None of the above is specific to mowers. The probabilistic core — predict from motion, correct from observation, track covariance, gate outliers — is the universal grammar of robot state estimation, and it is exactly what flies a drone where GPS can't reach.

↪ Transfers to drones & aerial robots

VIO is how drones fly GPS-denied — indoors, under bridges, in urban canyons, through the same canopy that breaks a mower's RTK. The Forster preintegration math2 was developed on flying robots first. The main differences: a drone estimates full 6-DoF pose in 3D (not planar), has no wheel odometry so it leans harder on VIO and the IMU, and must run the whole loop faster because its dynamics are quicker and a stale estimate means a crash, not a stripe overlap error. The EKF/factor-graph fusion, loop closure, and ATE/RPE evaluation are identical. Learn the localization stack on a ground robot and you have learned the one that keeps a drone in the air.

This is the throughline of NLR's research program: the environment changes — snow, grass, altitude — but the estimation problem is one problem. Solve it well once and you can field a robot anywhere.

Sources & further reading

  1. Thrun, S., Burgard, W., & Fox, D. Probabilistic Robotics. MIT Press, 2005. The foundational text for the Bayes-filter framing and the EKF localization derivation.
  2. Forster, C., Carlone, L., Dellaert, F., & Scaramuzza, D. "On-Manifold Preintegration for Real-Time Visual–Inertial Odometry." IEEE Transactions on Robotics, 2017. doi:10.1109/TRO.2016.2597321
  3. Cadena, C., et al. "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age." IEEE Transactions on Robotics, 2016. doi:10.1109/TRO.2016.2624754
  4. Mur-Artal, R., Montiel, J.M.M., & Tardós, J.D. "ORB-SLAM: A Versatile and Accurate Monocular SLAM System." IEEE Transactions on Robotics, 2015. doi:10.1109/TRO.2015.2463671
  5. Sturm, J., Engelhard, N., Endres, F., Burgard, W., & Cremers, D. "A Benchmark for the Evaluation of RGB-D SLAM Systems." IROS, 2012. Origin of the ATE/RPE metrics in wide use today.
← All research Next: teaching a robot where the lawn ends →