Experience

A detailed overview of my research collaborations, industry work, open-source contributions, and teaching, focused on efficient, interpretable, and biologically grounded learning systems.

My research sits at the boundary of computational neuroscience, reinforcement learning, and efficient sequence modeling. I work on building systems that learn compact, stable representations capable of generalizing across tasks without catastrophic forgetting, while remaining interpretable and deployable on constrained hardware. Below are the key threads and collaborators.

Spiking Decision Transformers

Collaboration with Debasmita Biswas · Purdue University

Designed spiking formulations of decision transformers that replace dense softmax attention with leaky integrate-and-fire (LIF) neurons trained via surrogate gradients. The architecture incorporates phase coding for temporal feature binding and dendritic routing for low-power sequence control, targeting RL agents that can be deployed on energy-constrained hardware. Published as Spiking Decision Transformers: Local Plasticity, Phase-Coding, and Dendritic Routing for Low-Power Sequence Control.
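To make the LIF and surrogate-gradient idea concrete, here is a minimal sketch in plain Python. It is not the implementation from the paper: the function names, the leak factor `beta`, the threshold, the reset-by-subtraction rule, and the fast-sigmoid surrogate with its `slope` are all assumptions chosen for illustration.

```python
def lif_step(v, x, beta=0.9, threshold=1.0):
    """One discrete-time LIF update (illustrative parameters, not from the paper).

    Leak the membrane potential, integrate the input, emit a spike if the
    threshold is reached, then reset by subtracting the threshold.
    """
    v = beta * v + x
    spike = 1.0 if v >= threshold else 0.0
    v = v - spike * threshold
    return v, spike


def surrogate_grad(v, threshold=1.0, slope=10.0):
    """Fast-sigmoid surrogate for d(spike)/d(v).

    The true derivative of the step function is zero almost everywhere, so
    training substitutes a smooth bump centered on the threshold.
    """
    return 1.0 / (1.0 + slope * abs(v - threshold)) ** 2


def run_lif(inputs, beta=0.9, threshold=1.0):
    """Drive a single LIF neuron with a sequence of input currents."""
    v = 0.0
    spikes = []
    for x in inputs:
        v, s = lif_step(v, x, beta, threshold)
        spikes.append(s)
    return spikes


if __name__ == "__main__":
    print(run_lif([0.5] * 5))
    print(surrogate_grad(1.0), surrogate_grad(2.0))
```

The forward pass uses the hard threshold, while the backward pass would use `surrogate_grad` in its place. In a real framework this pairing is usually expressed as a custom autograd function.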

2025 - Present

Trajectory Geometry of Transformer Representations

Collaboration with Gopal Singh (Metriqual) & Yacine Mahdid (DataDom)

A geometric analysis of how token representations evolve through transformer depth, characterizing the curvature, drift, and separability of representation trajectories across layers. This work provides new tools for understanding the internal mechanics of deep transformers beyond standard probing methods (arXiv 2606.09287).
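One simple way to picture curvature and drift is to treat a token's hidden state at each layer as a point on a path. The sketch below is an assumption for illustration only and may differ from the definitions used in the paper: it measures curvature as the turning angle between consecutive layer-to-layer steps, and drift as the straight-line distance from the first layer to the last. The names `turning_angles` and `drift` are hypothetical.

```python
import math


def displacement(a, b):
    """Vector from point a to point b."""
    return [bj - aj for aj, bj in zip(a, b)]


def turning_angles(trajectory):
    """Angle (radians) between consecutive layer-to-layer steps.

    `trajectory` is a list of per-layer hidden states for one token.
    A zero-length step yields an angle of 0.0 by convention (an assumption).
    """
    steps = [displacement(a, b) for a, b in zip(trajectory, trajectory[1:])]
    angles = []
    for u, w in zip(steps, steps[1:]):
        norm = math.hypot(*u) * math.hypot(*w)
        if norm == 0.0:
            angles.append(0.0)
            continue
        dot = sum(ui * wi for ui, wi in zip(u, w))
        cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
        angles.append(math.acos(cos))
    return angles


def drift(trajectory):
    """Straight-line distance between the first and last layer states."""
    return math.hypot(*displacement(trajectory[0], trajectory[-1]))


if __name__ == "__main__":
    path = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0]]
    print(turning_angles(path), drift(path))
```

A path that keeps moving in one direction has turning angles near zero, while a sharp change of direction between layers shows up as an angle near pi/2 or larger.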

2026

Variational Linear Attention

Collaboration with Gopal Singh · Metriqual

Developed a variational recurrent memory update whose Jacobian provably has unit spectral norm, enabling stable long-context reasoning without the quadratic cost of softmax attention. The method reformulates linear attention as a variational inference problem over associative memory states, providing both theoretical stability guarantees and practical efficiency gains. Published as Variational Linear Attention: Stable Associative Memory for Long-Context Transformers (arXiv 2605.11196).
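For context, the sketch below shows the standard, unnormalized linear-attention recurrence that this line of work starts from. It is not the variational update from the paper and carries none of its stability guarantees; the function name and the absence of a feature map or normalizer are simplifying assumptions. The point is only that a fixed-size memory matrix `S` replaces the growing attention matrix.

```python
def linear_attention(queries, keys, values):
    """Causal linear attention written as a recurrent associative memory.

    State:  S_t = S_{t-1} + k_t v_t^T   (d_k x d_v matrix, fixed size)
    Output: o_t = S_t^T q_t
    Each step costs O(d_k * d_v) regardless of sequence length.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v in zip(queries, keys, values):
        # Write: add the outer product of the current key and value.
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] += k[i] * v[j]
        # Read: project the memory onto the current query.
        outputs.append(
            [sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0, 2.0], [3.0, 4.0]]
    queries = [[1.0, 0.0], [1.0, 1.0]]
    print(linear_attention(queries, keys, values))
```

Because this plain additive update keeps accumulating into `S`, its state can grow without bound over long sequences, which is the kind of instability a norm-controlled update is meant to address.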

2025 - Present

AgroSense 2.0 - Precision Agriculture

Collaboration with Ruzina H. Laskar · CDOT & Rishav Tewari

Built an integrated deep learning system combining soil image analysis with nutrient profiling for crop recommendation under noisy, field-collected inputs. The follow-up, AgroSense 2.0, introduces cross-modal transformer fusion with geospatial raster integration and interpretable multi-task learning for precision crop recommendation (arXiv 2606.21892). AgroSense 1.0 has received 7 citations.

2025 - Present

AgroSense v1 - Deep Learning for Crop Recommendation

Collaboration with Prof. Ranjita Das · NIT Agartala

The original AgroSense system: an integrated deep learning pipeline for crop recommendation via soil image analysis and nutrient profiling. Published as AgroSense: An Integrated Deep Learning System for Crop Recommendation via Soil Image Analysis and Nutrient Profiling (arXiv 2509.01344, 7 citations).

2025

Interpretable Clinical ML - Diabetes Pipeline

Collaboration with Ruzina H. Laskar · CDOT & Rishav Tewari

Designed a unified three-stage ML framework for diabetes detection, subtype discrimination, and cognitive-metabolic hypothesis testing. The pipeline emphasizes SHAP-grounded per-patient attribution and distribution-shift robustness over raw benchmark performance (arXiv 2605.13464).

2026

Meta Research - Distributed Shampoo Optimizer

facebookresearch/optimizers · Co-author

Contributed to Meta's open-source optimizer library by implementing Muon Grafting within the Distributed Shampoo optimizer framework. The work involved integrating a novel grafting strategy into the production-grade distributed optimizer codebase, resulting in merged PRs and a co-author credit on the repository.

2025

All other code, experiments, and reproducible artifacts live on GitHub.

Graduate Engineer - Farvision ERP

Enterprise Software

Worked as a Graduate Engineer at Farvision ERP, an enterprise resource planning platform, maintaining and optimizing database workflows under live production constraints. This involved managing complex data pipelines, debugging production issues in real time, and ensuring data integrity across interconnected modules in enterprise-scale relational databases.

Jul 2023 - Aug 2024

Research Intern - NIT Mizoram, Department of Computer Science

In collaboration with CDAC (Centre for Development of Advanced Computing)

Interned at the Department of Computer Science, NIT Mizoram, in collaboration with CDAC. Built signature verification models on government-constrained datasets for efficient and reliable authentication. The work focused on designing models that operate under strict data governance requirements while maintaining high verification accuracy.

Also mentored five PG-DAI (Post Graduate Diploma in AI) students in the Computer Vision class, guiding them through foundational CV concepts, hands-on implementations, and project-based learning.

Nov 2022 - Apr 2023

Technical Blog - Deep Dives & Tutorials

vayishu.bearblog.dev

Writing in-depth technical posts on topics including Self-Supervised Learning, Mixture of Experts, KV Caching, LoRA & QLoRA, Prefix Tuning, and REFRAG: Recursive Fragmentation for Efficient Retrieval-Augmented Decoding. Also writing essays on AI philosophy, such as Towards Autonomous Preference Formation in AI.

2025 - Present