AI Can Do More Than Build Web Apps: What Hardware Hacking Teaches Us About LLM Capabilities

What happens when you ask an LLM to help build a DIY weather station from scratch?

January 20th, 2026

We undervalue what AI can do

Most AppSec and software teams interact with AI coding assistants in the context of web application development. Ask it to write a React component? Easy. Summarize a Node endpoint? Done. But that narrow perception limits how organizations imagine integrating AI into real engineering workflows, particularly for ecosystems beyond JavaScript. That's somewhat expected: JavaScript has quickly become the language of both client- and server-side development for the web.

Hardware projects, with their timing issues, signal noise, electrical quirks, and imperfect documentation, are the opposite of a neat, deterministic software task. Unlike building a CRUD route, nothing is guaranteed to work the first time (or the fifth). And that's exactly why they're interesting test beds for evaluating AI's coding capability.

To explore this, we walk through a real experiment: using Google Gemini to build a fully working RF-decoding weather station from inexpensive sensors, an ESP32 microcontroller, and a stack of AI-generated code. Spoiler: it worked. But the path there was not exactly straight, and it exposed real limits in AI code generation beyond the web.

1. Build it or buy it?

Every story in technology starts the same way: with a problem. This one is fairly simple: a weather station that integrates with an existing smart home system. That leaves two paths: buy a consumer model or build it yourself. Instead of starting with a costly purchase, this project began with a cheap RF weather station, an SDR dongle, and an ESP32. The requirements quickly took shape:

  • Capture 433 MHz RF packets from off-the-shelf weather sensors

  • Decode the proprietary packet structure

  • Write microcontroller code to ingest the data

  • Integrate the readings into Home Assistant

  • Do all hardware wiring and configuration

None of these steps resembles “write me an Express API” or “build a front end for a basic application.”

Large language models excel at synthesizing scattered, sometimes contradictory online resources into concrete next steps. Asking Gemini “What’s the first step?” produced a rough but workable action plan, along with a reading list of five books on hardware hacking, ESP32 programming, and RF security:

  1. Capture RF signals with an SDR dongle

  2. Decode the packets using open-source tools

  3. Generate Arduino/ESP32 code for interpretation

  4. Integrate with Home Assistant through ESPHome

This is meaningful because RF decoding isn’t a solved, one-click problem. It requires reasoning across electrical engineering, RF analysis, microcontrollers, and YAML-based device integrations. AI won’t CAD your PCB, but it will meaningfully reduce the time it takes to move from “I have no idea where to start” to “I have a direction.”

2. AI can write real embedded code (not just JavaScript)

One of the biggest surprises: the model generated working microcontroller code for the ESP32 within a few prompts. It quickly produced a prototype, suggested specific hardware, and configured the components. But once the application needed debugging, the code grew messy and hard to maintain; the model lost context between the code and the prompt, and getting correct, valid output became a struggle. Despite this, with some prompting, the AI was still able to do all of the following (a sketch of the capture pattern follows the list):

  • Set up raw RF pulse capture

  • Write an initial timing loop for packet interpretation

  • Implement logic to decode sensor IDs, channels, and temperature values

  • Match open-source decoding logic and rewrite it in Arduino C

  • Iterate on timing bugs introduced by microcontroller speed differences
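
To make the first two bullets concrete, here is a minimal sketch of the interrupt-driven capture pattern, in the spirit of what the model produced. The pin number, timing thresholds, and pulse-distance encoding are illustrative assumptions, not the actual sensor’s protocol.

```cpp
#include <Arduino.h>

constexpr uint8_t RF_PIN   = 4;    // GPIO wired to the 433 MHz receiver's data pin (assumed)
constexpr size_t  BUF_SIZE = 512;  // ring buffer of pulse widths

volatile uint32_t pulses[BUF_SIZE];
volatile size_t   writeIdx = 0;

// ISR: on every edge, record the microseconds elapsed since the previous edge.
void IRAM_ATTR onEdge() {
  static uint32_t lastEdge = 0;
  const uint32_t now = micros();
  pulses[writeIdx % BUF_SIZE] = now - lastEdge;
  writeIdx++;
  lastEdge = now;
}

void setup() {
  Serial.begin(115200);
  pinMode(RF_PIN, INPUT);
  attachInterrupt(digitalPinToInterrupt(RF_PIN), onEdge, CHANGE);
}

void loop() {
  // Classify pulse widths against hypothetical thresholds for a
  // pulse-distance encoding: short gap = 0, long gap = 1, anything
  // else is treated as a sync/reset gap.
  static size_t readIdx = 0;
  while (readIdx < writeIdx) {
    const uint32_t w = pulses[readIdx % BUF_SIZE];
    if (w > 300 && w < 700)         Serial.print('0');
    else if (w > 1700 && w < 2300)  Serial.print('1');
    else                            Serial.println();
    readIdx++;
  }
}
```

A production version would guard the shared indices and handle buffer overruns; this is only the shape of the approach.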

These are not typical AI-assistant happy paths. Embedded code is extremely sensitive to timing and side effects. A model can’t guess its way into a correct decoding algorithm; it must reason about how microcontrollers process signals in real time. But the model’s debugging ability, or lack thereof, reflected a fundamental truth about LLMs:

  • They handle pattern-based reasoning extremely well

  • They struggle with stateful, low-level timing behavior

  • They rely heavily on you to report runtime observations, then adjust their internal model of the world

When a seemingly correct decode loop failed, Gemini confidently explained why (incorrectly, of course). When given new output, it reconstructed its explanation (also incorrectly). With enough iterations, it reached a working solution. But that was not the model alone: it took additional analysis by a developer to explain why its output was wrong.
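
For a sense of what that iteration converged on, here is a hedged sketch of the field-extraction step. The 24-bit layout, bit offsets, and scaling are hypothetical stand-ins for the sensor’s actual proprietary format.

```cpp
#include <stdint.h>

struct Reading {
  uint8_t id;       // rolling sensor ID
  uint8_t channel;  // 1-3, set by a switch on the sensor
  float   tempC;
};

// Decode a hypothetical 24-bit packet laid out as:
// [8-bit id][2-bit channel][2 unused][12-bit signed temp in tenths of a degree].
// A real decoder would also validate a checksum before trusting any field.
bool decodePacket(uint32_t bits, Reading &out) {
  out.id      = (bits >> 16) & 0xFF;
  out.channel = ((bits >> 14) & 0x03) + 1;
  int16_t raw = bits & 0x0FFF;
  if (raw & 0x0800) raw -= 0x1000;  // sign-extend 12-bit two's complement
  out.tempC = raw / 10.0f;
  return true;
}
```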

3. AI becomes more capable as the task becomes more conceptual

The initial project plan was for the code to use Zigbee on the ESP32-C6 to integrate with Home Assistant. The model discouraged that approach and recommended ESPHome instead. This wasn’t a hallucination; it was architectural reasoning drawn from the model’s training data: Zigbee support on C6 devices is still maturing, while ESPHome has mature support for custom sensors fed by RF-decoded input.

LLMs can reason about architecture, not just code syntax.
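
The project’s actual integration went through ESPHome’s YAML configuration, which doesn’t lend itself to a C++ snippet. As a hedged illustration of what the hand-off to Home Assistant looks like at the code level, here is a minimal sketch of an alternative route Home Assistant also supports: publishing readings over MQTT with the widely used PubSubClient library. The broker address, topics, and credentials are placeholders, not the project’s setup.

```cpp
#include <WiFi.h>
#include <PubSubClient.h>

struct Reading { uint8_t id; uint8_t channel; float tempC; };  // as in the earlier sketches

WiFiClient   wifiClient;
PubSubClient mqtt(wifiClient);

// Publish one decoded reading to a topic that Home Assistant's MQTT
// integration can consume.
void publishReading(const Reading &r) {
  if (!mqtt.connected()) mqtt.connect("weather-esp32");  // client ID (placeholder)
  char topic[48], payload[16];
  snprintf(topic, sizeof(topic), "home/weather/ch%u/temperature", (unsigned)r.channel);
  snprintf(payload, sizeof(payload), "%.1f", (double)r.tempC);
  mqtt.publish(topic, payload);
}

void setup() {
  WiFi.begin("SSID", "PASSWORD");          // placeholders
  while (WiFi.status() != WL_CONNECTED) delay(100);
  mqtt.setServer("192.168.1.10", 1883);    // broker address (placeholder)
}

void loop() {
  mqtt.loop();  // keep the MQTT connection serviced
}
```

Much of ESPHome’s appeal is that it hides exactly this kind of plumbing behind declarative configuration, which is part of why the model steered the project there.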

Many engineering teams only ask AI to produce functions. But its real leverage is helping teams evaluate tradeoffs or navigate implementation pathways. In software, this maps directly to design decisions like:

  • picking a framework

  • choosing a dataflow model

  • selecting an authentication strategy

  • shaping interface boundaries

These are high-impact choices, and AI is surprisingly strong at helping you reason through them.

4. AI still needs human guardrails

Even with a fully working weather station, the project ended in the most human way possible: the sensors themselves were faulty. Channel 1’s temperature was wildly wrong; Channel 2’s was correct. The AI didn’t generate bad code; the hardware failed. This illustrates an important point for developers:

AI can generate correct code for a broken system. Only humans can tell the difference.
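
One concrete way to encode that human judgment is a plausibility filter: checks that reject readings the decoder accepts but the physical world rules out. The bounds below are illustrative assumptions.

```cpp
#include <stdint.h>

struct Reading { uint8_t id; uint8_t channel; float tempC; };  // as in the earlier sketches

// Reject readings that decode cleanly but are physically implausible,
// e.g. a failing sensor reporting 120 °C in a living room.
bool isPlausible(const Reading &r) {
  const bool tempOk    = r.tempC > -40.0f && r.tempC < 60.0f;
  const bool channelOk = r.channel >= 1 && r.channel <= 3;
  return tempOk && channelOk;
}
```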

This mirrors what we see in application security too. LLMs can produce plausible vulnerabilities or plausible fixes, but without rigorous evaluation frameworks, you risk treating “syntactically correct” and “semantically correct” as interchangeable.

Our research on LLM-based vulnerability detection showed the same: AI produces working output, but not always trustworthy output. The human remains the arbiter of correctness. (See Semgrep’s evaluation of AI Coding Agents: high contextual power, weak deep-flow reasoning.)

5. AI is not replacing engineers; it's widening what they can attempt

Most non-hardware developers would never attempt RF decoding from scratch. Not because they couldn’t learn it, but because the upfront research cost is so high. AI, by contrast, compresses that cost:

  • It removes the fear barrier

  • It makes unknown domains feel approachable

  • It lets you prototype beyond your existing specialties

  • It reduces “time to first success”

AI expands the surface area of what one engineer can attempt. Sometimes it eliminates learning outright; more often, the effective use is to accelerate learning to the point where experimentation becomes cheap.

Takeaways for engineering leaders

If your team only uses AI to generate TypeScript, you’re leaving value on the table. Used well, AI can push your organization further:

  • Rapid prototyping in unfamiliar languages

  • Integrating legacy hardware or proprietary protocols

  • Evaluating architectural paths with incomplete information

  • Automating boilerplate across embedded, backend, and infrastructure layers

  • Debugging through iterative log-driven prompting

  • Generating adapters between incompatible systems

AI should be treated less like a code generator and more like a force multiplier for exploratory engineering.

Final thoughts

This small weather-station experiment illustrates something bigger: AI is fully capable of working beyond the boundaries of web development. It can meaningfully contribute to multi-disciplinary problems involving RF, embedded devices, decoding algorithms, and architecture design.

But, and this is key, it succeeds best when paired with human judgment, curiosity, and a willingness to iterate through frustration. If your teams embrace that hybrid model, you’ll unlock projects you’d never previously have attempted.



About


Semgrep enables teams to use industry-leading AI-assisted static application security testing (SAST), supply chain dependency scanning (SCA), and secrets detection. The Semgrep AppSec Platform is built for teams that struggle with noise by helping development teams apply secure coding practices.