We are at an inflection point. Machine perception once promised a radical simplification of the kill chain: sense, classify, prosecute. In contested skies that promise now bumps against deliberate denial. Jamming, spoofing, sensor degradation and adversarial manipulation do not just reduce accuracy. They change the rules of engagement. Any AI that must name a target while electromagnetic spectrum access is intermittent, cameras are occluded, and radars are being actively deceived must be designed from the ground up for uncertainty, contestability and human oversight.

The recent bloom of autonomy platforms shows both why the problem matters and how fast capability is racing forward. Commercial and defense startups have moved autonomy from lab demos to fieldable stacks, and partnerships between autonomy firms and big primes accelerated in 2025 as platforms like Hivemind entered broader procurement conversations. These moves signal that AI sensing and decision support are already being baked into operational concepts for contested operations.

But the technical literature makes the threat picture clear. Image classifiers and object detectors can be coaxed into mistakes by physically realizable perturbations, weather-like artifacts and background manipulations that hide or hallucinate targets. Optical pipelines remain brittle in the face of targeted, physics-aware attacks. At the same time researchers have shown that radar and mmWave sensors are not magic bullets; waveform-level spoofing and jamming variants can add, remove or move detections in a target track, creating maliciously misleading point clouds. These are not hypothetical edge cases. They are active research demonstrations that expose the surface area attackers can exploit.

So what does resilient target recognition look like? First, modal humility. Relying on a single sensor modality is no longer acceptable for high-stakes identification. Multimodal fusion that blends electro-optical, infrared, radar, RF and inertial cues raises the bar for an adversary because each modality fails differently and under different constraints. The recent literature and engineering pipelines emphasize intermediate and deep fusion approaches that preserve cross-modal cues while enabling fallbacks when one stream is compromised. That fusion must also be active and context aware, not static: if GNSS integrity drops or the RF spectrum is saturated with interference, the fusion stack needs to reweight sensors and raise human-alert tiers automatically.
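To make that concrete, here is a minimal sketch of what context-aware reweighting could look like, assuming each modality supplies a detection confidence and a self-assessed integrity score. The sensor names, priors and alert threshold are illustrative placeholders, not a description of any fielded system.

```python
from dataclasses import dataclass

# Hypothetical per-modality report: a detection confidence plus an integrity score.
@dataclass
class SensorReport:
    confidence: float   # detector confidence in [0, 1]
    integrity: float    # self-assessed integrity in [0, 1], e.g. from interference heuristics

def fuse(reports: dict[str, SensorReport],
         priors: dict[str, float],
         alert_threshold: float = 0.4) -> tuple[float, bool]:
    """Weight each modality by its prior reliability times its current integrity,
    then renormalize. Escalate to a human-alert tier when too much of the
    nominal sensor weight has had to be discounted."""
    weights = {name: priors[name] * r.integrity for name, r in reports.items()}
    total = sum(weights.values())
    if total == 0.0:
        return 0.0, True  # every stream degraded: fused output is meaningless, escalate
    fused = sum(weights[name] * r.confidence for name, r in reports.items()) / total
    discounted = 1.0 - total / sum(priors.values())
    return fused, discounted > alert_threshold

# Example: radar integrity collapses under jamming, so EO/IR carry the estimate
reports = {
    "eo":    SensorReport(confidence=0.82, integrity=0.95),
    "ir":    SensorReport(confidence=0.74, integrity=0.90),
    "radar": SensorReport(confidence=0.30, integrity=0.10),
}
priors = {"eo": 0.3, "ir": 0.2, "radar": 0.5}
fused, escalate = fuse(reports, priors)
print(f"fused confidence {fused:.2f}, raise human-alert tier: {escalate}")
```

The design point is the escalation flag: reweighting alone can quietly hide how much of the evidence base has been lost, so the stack should report that loss rather than absorb it.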

Second, physics-aware training and sim-to-real practice matter. Synthetic datasets, domain randomization and purpose-built aerial collections have proven their value for pretraining detectors that must find small, distant and occluded targets. Synthetic ship and hazy-scene datasets show that well-crafted synthetic data can improve robustness for small object detection at altitude. But synthetic realism alone will not solve adversarially crafted attacks. The training pipeline must incorporate adversarial scenarios generated not only in pixel space but at the sensor and waveform level.
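As one illustration of what sensor-level augmentation means in practice, the toy sketch below injects spoofing-style ghost returns into synthetic range-Doppler maps during training. The corruption model is a deliberately simplified stand-in for the waveform-level attacks described in the literature, and the array shapes and probabilities are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def inject_ghost_returns(range_doppler: np.ndarray, n_ghosts: int = 3,
                         strength: float = 0.8) -> np.ndarray:
    """Toy stand-in for a waveform-level spoofing model: add spurious peaks
    ("ghost" detections) to a range-Doppler map so the detector trains against
    added or moved returns, not just pixel-space noise."""
    corrupted = range_doppler.copy()
    peak = corrupted.max()
    rows, cols = corrupted.shape
    for _ in range(n_ghosts):
        r, c = rng.integers(0, rows), rng.integers(0, cols)
        corrupted[r, c] += strength * peak
    return corrupted

def augment_batch(batch: np.ndarray, p: float = 0.3) -> np.ndarray:
    """Apply the spoofing-style corruption to a random subset of training samples."""
    out = batch.copy()
    for i in range(len(out)):
        if rng.random() < p:
            out[i] = inject_ghost_returns(out[i])
    return out

# Example: a batch of 8 synthetic 64x64 range-Doppler maps
batch = rng.random((8, 64, 64)).astype(np.float32)
augmented = augment_batch(batch)
```

A production pipeline would replace the random-peak model with attack generators grounded in the actual sensor physics, but the hook into the training loop looks the same.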

Third, uncertainty must be first-class. Modern target recognition needs calibrated likelihoods, out-of-distribution detection and causal-style reasoning about evidence. A classifier that returns a confident label with no provenance is a liability in contested operations. When a fused system reports “10 percent probability hostile” with an accompanying explanation of which sensors drove that estimate and what they saw, a human supervisor can make an informed risk call. That kind of transparency is the operational imperative behind explainable AI programs and NIST-style risk frameworks for AI engineering, which urge measurement, governance and continuous monitoring of model behavior.
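A minimal sketch of what such a report could look like, assuming per-sensor logits are available upstream: the simple averaging rule, the temperature value and the sensor names are placeholders chosen for clarity, not a recommended calibration scheme.

```python
import math

def calibrated_probability(logit: float, temperature: float = 2.5) -> float:
    """Temperature scaling, a standard post-hoc calibration step. The temperature
    would be fit on held-out data; 2.5 here is purely illustrative."""
    return 1.0 / (1.0 + math.exp(-logit / temperature))

def build_report(per_sensor_logits: dict[str, float]) -> dict:
    """Return a probability plus the provenance a human supervisor needs:
    which sensors drove the estimate and which way each one leaned."""
    fused_logit = sum(per_sensor_logits.values()) / len(per_sensor_logits)
    return {
        "p_hostile": round(calibrated_probability(fused_logit), 3),
        "evidence": {
            name: ("supports" if logit > 0 else "contradicts",
                   round(calibrated_probability(logit), 3))
            for name, logit in per_sensor_logits.items()
        },
    }

# Example: radar weakly supports the hostile label, EO contradicts, IR is near-neutral
print(build_report({"radar": 0.9, "eo": -1.6, "ir": 0.1}))
```

The output is deliberately structured for a human, not another model: a single calibrated number plus the per-sensor evidence that produced it.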

Fourth, adversarial and electronic warfare threats require active detection and countermeasures. Researchers have demonstrated both physical adversarial patches for cameras and black-box waveform attacks on FMCW mmWave radars. Defenders must instrument sensors to detect signs of manipulation, deploy cross-checks across modalities, and incorporate deception-aware filters into tracking pipelines. In practice that means rapid test, evaluation, verification and validation (TEVV) cycles for models, red-team engagements that include spectrum and physical-layer adversaries, and mission-level playbooks that define human authorities when automation encounters contested inputs.
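The sketch below illustrates one such deception-aware cross-check: a track corroborated by only one modality, when others should also see it, is flagged rather than silently promoted. The gating rule and data structures are hypothetical, and a real pipeline would also account for field of view, occlusion and sensor timing.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    track_id: int
    position: tuple[float, float]               # simplified 2D position in meters
    corroborated_by: set[str] = field(default_factory=set)
    spoof_suspect: bool = False

def cross_check(track: Track,
                detections: dict[str, list[tuple[float, float]]],
                gate: float = 50.0) -> Track:
    """Deception-aware gating sketch: a track kept alive by a single modality
    while the others report nothing nearby is marked as a spoofing suspect."""
    def near(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 < gate

    track.corroborated_by = {
        modality for modality, dets in detections.items()
        if any(near(track.position, d) for d in dets)
    }
    track.spoof_suspect = len(track.corroborated_by) <= 1 and len(detections) >= 2
    return track

# Example: radar reports a return near the track, EO and IR see nothing there
t = cross_check(
    Track(track_id=7, position=(1200.0, 450.0)),
    {"radar": [(1210.0, 455.0)], "eo": [], "ir": [(300.0, 80.0)]},
)
print(t.spoof_suspect, t.corroborated_by)  # True {'radar'}
```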

Fifth, human-machine roles must be recast. Past debates framed autonomy as either full replacement or strict human override. The better operational model for contested skies is adaptive authority. AI should own high-frequency perception and short-latency tracking tasks while humans retain authority over lethal decisions and ambiguous target labels. That supervisory architecture demands interfaces that make AI uncertainty legible and that allow graceful handover under degraded comms. It also demands policy and rules of engagement that are synchronized with the system’s technical capabilities and failure modes.
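As a sketch of adaptive authority, consider a supervisory-mode policy whose invariant is that machine authority never widens when human oversight narrows. The mode names and thresholds below are hypothetical placeholders, not doctrine.

```python
from enum import Enum

class Mode(Enum):
    AI_TRACKS_HUMAN_DECIDES = "AI maintains tracks; human retains identification and engagement authority"
    HUMAN_REVIEW_REQUIRED = "ambiguous evidence; AI surfaces full provenance for human review"
    HOLD = "degraded comms or contested inputs; no engagement, track-and-report only"

def supervisory_mode(uncertainty: float, inputs_contested: bool, comms_ok: bool) -> Mode:
    """Hypothetical adaptive-authority policy. The thresholds are placeholders;
    the point being illustrated is the direction of the fallback: when comms
    degrade or inputs look contested, the system narrows to track-and-report."""
    if not comms_ok or inputs_contested:
        return Mode.HOLD
    if uncertainty > 0.2:
        return Mode.HUMAN_REVIEW_REQUIRED
    return Mode.AI_TRACKS_HUMAN_DECIDES

print(supervisory_mode(uncertainty=0.05, inputs_contested=False, comms_ok=True))
# Mode.AI_TRACKS_HUMAN_DECIDES
print(supervisory_mode(uncertainty=0.05, inputs_contested=False, comms_ok=False))
# Mode.HOLD
```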

What should researchers and funders prioritize now? Invest in multimodal TEVV pipelines that include electromagnetic and physical-layer red teams. Fund research that ties sensor physics into learned models so that perception systems are aware of how their inputs could be spoofed. Build operational datasets that include contested-spectrum conditions and adversarial artifacts, and make curated challenge sets available to the community so robustness can be benchmarked honestly. Finally, adopt risk management and governance practices such as those in NIST’s AI RMF so deployment decisions are traceable, measurable and auditable.

The sky in a future conflict will be noisy. It will be crowded with cooperative and non-cooperative aircraft, with decoys, loitering munitions and cheap swarms. AI can be the difference between timely, accurate recognition and a cascade of misidentifications that escalate conflict by mistake. But that AI must be engineered for contestability. Resilience is not chiefly a data problem or a compute problem. It is an architectural and cultural problem. It requires cross-disciplinary teams that understand RF engineering, sensor physics, machine learning, human factors and the legal and ethical contours of use.

Takeaway: build perception stacks that assume they will be lied to. Design systems whose uncertainty can be inspected. Train with synthetic realism and adversarial rigor. And hardwire human authority where the stakes are highest. If we can do those things, AI will not make war inevitable. It will make decisions in war better informed, more auditable and less prone to catastrophic error.