Merlin: Bridging the reproducibility gap in Quantum Machine Learning

Merlin is a community framework for systematic, reproducible quantum machine learning research — built to close the gap between QML claims and working code. As QML grows rapidly, reproducibility failures and a strong bias toward gate-based systems leave critical questions unanswered, especially in photonic quantum computing. In this blog post, we introduce Merlin’s design philosophy, its photonic-first approach, and the benchmarking workflow that is already enabling collaborations across academia and industry.

The Reproducibility Gap

If you read the QML literature, you will frequently encounter strong claims: improved accuracy, better scaling, even hints of quantum advantage. But turning those claims into working code is often surprisingly difficult. Even when implementations are shared, they may be incomplete, unmaintained, or subtly inconsistent with the description in the paper. This reflects a broader reproducibility challenge across AI research (Henderson et al., 2018), amplified here by the additional complexity of quantum systems.

If results cannot be reproduced, they cannot be trusted.

A Blind Spot: Photonic Quantum Computing in QML

Another limitation of the current QML landscape is its strong bias toward gate-based quantum computing. While this paradigm is important, it often overshadows alternative approaches — particularly photonic quantum computing.

Despite its promise, photonic QML remains underrepresented in the literature. Native photonic algorithms are rarely explored or benchmarked systematically. At the same time, researchers lack unified tools to test models across platforms or to experiment with hybrid quantum–classical schemes in a consistent way.

This is a missed opportunity.

Photonic systems offer compelling advantages for QML:

  • They are among the leading candidates for quantum advantage demonstrations
  • They naturally explore large Hilbert spaces via bosonic modes
  • They align well with certain machine learning primitives

Why Merlin

Merlin was built to address these gaps.

Not as a silver bullet, but as a practical tool for systematic exploration.

The idea is simple:

If we want to discover whether QML brings real advantages,
we need to test many models: quickly, rigorously, and reproducibly.

But that is not all. Reproducibility is essential, but it is only the first step.

The final goal is replicability: we want to bring QML methods from the gate-based world to photonic quantum computing, go beyond the existing results, and build the next QML models on top of them.

Watch the Webinar Replay

Want to see Merlin in action? We presented the framework live, covering motivation, architecture, and early results.

What Merlin Enables

Merlin provides a framework to:

  • Implement QML models with a PyTorch-like interface
  • Build hybrid quantum–classical pipelines
  • Run systematic benchmarks across models and datasets
  • Focus specifically on photonic quantum computing workflows

Photonic systems are particularly important here because they:

  • Are leading candidates for quantum advantage experiments
  • Offer natural compatibility with certain ML subroutines
  • Enable new hybrid model architectures
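
To make the idea of a composable hybrid pipeline concrete, here is a minimal sketch in plain Python. It does not use Merlin's API: the classes Affine, MachZehnderLayer, and Sequential are hypothetical stand-ins written for this post. The "photonic" layer simulates a single photon in a two-mode Mach-Zehnder interferometer (beam splitter, phase shifter, beam splitter), where the classical input sets the phase and the output is the detection probability in each mode.

```python
import cmath
import math
from typing import List, Sequence


def matvec(matrix, vector):
    """Multiply a small dense matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class Affine:
    """Classical layer: y_j = sum_i W[j][i] * x_i + b[j]."""

    def __init__(self, weights: Sequence[Sequence[float]], bias: Sequence[float]):
        self.weights = [list(row) for row in weights]
        self.bias = list(bias)

    def __call__(self, x: Sequence[float]) -> List[float]:
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]


class MachZehnderLayer:
    """Toy photonic layer (hypothetical, not a Merlin class).

    One photon enters mode 0, passes a 50:50 beam splitter, a phase
    shifter on mode 0 whose phase is the layer input, and a second
    50:50 beam splitter. Returns the two detection probabilities.
    """

    def __call__(self, x: Sequence[float]) -> List[float]:
        phi = x[0]
        s = 1 / math.sqrt(2)
        beam_splitter = [[s, 1j * s], [1j * s, s]]
        amplitudes = [1 + 0j, 0j]                      # photon in mode 0
        amplitudes = matvec(beam_splitter, amplitudes)
        amplitudes[0] *= cmath.exp(1j * phi)           # data-dependent phase
        amplitudes = matvec(beam_splitter, amplitudes)
        return [abs(a) ** 2 for a in amplitudes]


class Sequential:
    """Chain layers so that each output feeds the next input."""

    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


# classical pre-processing -> photonic layer -> classical readout
model = Sequential(
    Affine([[1.0, 1.0]], [0.0]),     # 2 features -> 1 phase
    MachZehnderLayer(),              # 1 phase -> 2 probabilities
    Affine([[1.0, -1.0]], [0.0]),    # 2 probabilities -> 1 score
)

if __name__ == "__main__":
    print(model([0.3, 0.4]))
```

Because every layer is just a callable with the same calling convention, the photonic block can be swapped for another component without touching the rest of the pipeline, which is the property the ablation step in the workflow relies on.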

The Usual Workflow in Merlin

  1. Start from an idea.
    Experiments typically begin with a paper, a concrete use case, or a new theoretical insight worth exploring.
  2. Ground the problem.
    Rebuild baselines, compare against strong classical methods, and clarify what actually drives performance.
  3. Adapt and implement.
    Translate models into photonic architectures and integrate them into hybrid quantum–classical pipelines.
  4. Benchmark thoroughly.
    Evaluate accuracy, parameter efficiency, scaling behavior, and data requirements.
  5. Stress-test with ablations.
    Remove or replace the quantum component to assess its real contribution.
  6. Validate reproducibility.
    At every stage, ask: can the result be reproduced reliably?
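
Steps 4 to 6 can be sketched as a small seeded benchmark harness. This is an illustration under stated assumptions, not Merlin's actual tooling: the function names are hypothetical, the "quantum" component is a classical stand-in (the sin^2(x/2) response of a two-mode interferometer), and the synthetic task is deliberately constructed so that the feature map matters. It says nothing about real quantum advantage; it only shows the shape of an ablation and a reproducibility check.

```python
import math
import random
from statistics import mean
from typing import Callable, Dict, List, Tuple

FeatureMap = Callable[[float], float]


def interferometer_feature(x: float) -> float:
    """Stand-in for the quantum component: sin^2(x / 2)."""
    return math.sin(x / 2) ** 2


def identity_feature(x: float) -> float:
    """Ablation: the quantum component is replaced by a pass-through."""
    return x


def make_data(rng: random.Random, n: int) -> List[Tuple[float, int]]:
    """Synthetic task: label 1 when x lies in the middle half of [0, 2*pi]."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2 * math.pi)
        data.append((x, int(math.pi / 2 < x < 3 * math.pi / 2)))
    return data


def run_experiment(seed: int, feature: FeatureMap, n: int = 200) -> float:
    """Fit a nearest-centroid classifier on the feature; return test accuracy."""
    rng = random.Random(seed)  # all randomness derives from this seed
    train, test = make_data(rng, n), make_data(rng, n)
    centroids = {
        label: mean(feature(x) for x, y in train if y == label)
        for label in (0, 1)
    }
    correct = 0
    for x, y in test:
        f = feature(x)
        prediction = min(centroids, key=lambda c: abs(f - centroids[c]))
        correct += int(prediction == y)
    return correct / len(test)


def benchmark(seeds: List[int]) -> Dict[str, List[float]]:
    """Run the full model and its ablation on the same seeds."""
    variants = {"full": interferometer_feature, "ablated": identity_feature}
    return {
        name: [run_experiment(seed, feature) for seed in seeds]
        for name, feature in variants.items()
    }


if __name__ == "__main__":
    seeds = [0, 1, 2, 3, 4]
    first, second = benchmark(seeds), benchmark(seeds)
    # Reproducibility check: identical seeds must give identical scores.
    assert first == second, "same seeds must give identical results"
    for name, scores in first.items():
        print(f"{name}: mean accuracy {mean(scores):.3f} over {len(seeds)} seeds")
```

Running both variants on the same seeds keeps the comparison fair, and rerunning the whole benchmark is the cheapest possible test that a reported number can be regenerated.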

A Different Way to Approach QML

Merlin is more than a tool: it is a way of working.

It treats QML as empirical rather than purely theoretical, iterative rather than definitive, and composable rather than monolithic. Our goal is to bring to QML what already exists in classical machine learning: a library of reusable, testable, and extensible building blocks.

Progress in ML did not come from a single breakthrough, but from combining ideas. For example, decades of innovation accumulated before large language models (LLMs) became as powerful as they are today:

1986: Backpropagation (learning deep networks)
   ↓
2003–2013: Word embeddings (semantic representations)
   ↓
2014: Attention (dynamic focus)
   ↓
2017: Transformers (self-attention at scale)
   ↓
2020+: Scaling laws (predictable progress)

We believe QML will evolve the same way.

A Growing Community Effort

This is not a closed project. In just a few months, we have reproduced almost thirty papers, made a set of components accessible, and formed collaborations around QML across academia and industry. As Sam Stanwyck (NVIDIA) puts it:

“Powerful simulation tools are essential to develop better algorithms and accelerate the path to broad quantum advantage.
Merlin solves a critical ecosystem need by opening the door for the broader research community to develop with photonic quantum circuits.”

What Comes Next

This post is the starting point of a series exploring QML papers through reproducibility using Merlin. Each month, we will share the most promising reproduced papers. Get in touch if you want to participate in this initiative.


References

  1. Blum, A., & Rivest, R. L. (1992). Training a 3-node neural network is NP-complete. Neural Networks, 5(1), 117–127.
  2. Bowles, J., et al. (2025). Backpropagation for parameterized quantum circuits with structured architectures. (Preprint / to appear).
  3. Cerezo, M., Arrasmith, A., Babbush, R., Benjamin, S. C., Endo, S., Fujii, K., McClean, J. R., Mitarai, K., Yuan, X., Cincio, L., & Coles, P. J. (2021). Variational quantum algorithms. Nature Reviews Physics, 3, 625–644.
  4. Froese, T., et al. (2023). On the computational hardness of training shallow neural networks. (Preprint).
  5. Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2018). Deep reinforcement learning that matters. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).
  6. Huang, H.-Y., Kueng, R., & Preskill, J. (2021). Information-theoretic bounds on quantum advantage in machine learning. Physical Review Letters, 126(19), 190505.
  7. Larocca, M., et al. (2024). Diagnosing barren plateaus in quantum machine learning.
  8. McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush, R., & Neven, H. (2018). Barren plateaus in quantum neural network training landscapes. Nature Communications, 9, 4812.
  9. Mitarai, K., Negoro, M., Kitagawa, M., & Fujii, K. (2018). Quantum circuit learning. Physical Review A, 98(3), 032309.
  10. Salavrakos, A., Maring, N., Emeriau, P.-E., & Mansfield, S. (2025). Photon-native quantum algorithms. Materials for Quantum Technology, 5(2), 023001. doi:10.1088/2633-4356/adc531.
  11. Schuld, M., Bergholm, V., Gogolin, C., Izaac, J., & Killoran, N. (2019). Evaluating analytic gradients on quantum hardware. Physical Review A, 99(3), 032331.
