A 165M parameter Memory-Augmented Language Model (MALM) for semantic code search, trained on CodeParrot.
```bash
# Install dependencies
pip install mlx huggingface_hub numpy

# Download model
huggingface-cli download codelion/malm-165m --local-dir ./malm-165m

# Run semantic search
python malm-165m/inference.py --query "function that sorts a list"
```
Example output:

```text
Query: function that sorts a list
------------------------------------------------------------
1. array_sort (score: 0.9526)
   Signature: array_sort(col)
   Docstring: Collection function: sorts the input array in ascending order...

2. sort_array (score: 0.7707)
   Signature: sort_array(col, asc)
   Docstring: Collection function: sorts the input array in ascending or descending order...
```
```python
import sys
from pathlib import Path

from huggingface_hub import snapshot_download

# Download the repo and import its standalone inference script
model_path = snapshot_download("codelion/malm-165m")
sys.path.insert(0, model_path)
from inference import load_model, search_functions

# Load the model, tokenizer, and memory bank of functions
model, tokenizer, functions, config = load_model(Path(model_path))
print(f"Loaded {len(functions)} functions")

# Search
results = search_functions(
    model, tokenizer, functions,
    query="connect to database",
    top_k=5,
)

for name, signature, docstring, score in results:
    print(f"{name}: {score:.4f}")
```
MALM combines a transformer with learned memory retrieval for semantic code search. Instead of autoregressively generating text like a standard decoder-only LLM, it encodes the query and scores it against a memory bank of function representations. Because this memory-augmented architecture differs from standard LLMs, it is not compatible with `mlx-lm generate`, so a custom inference script is provided.
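In outline, the retrieval step amounts to embedding the query and ranking memory entries by similarity. The sketch below shows this with cosine similarity over a toy memory bank in plain NumPy; the function name and arrays here are illustrative, not part of the released code, and MALM's actual encoders are transformers rather than fixed vectors.

```python
import numpy as np

def cosine_top_k(query_vec, memory, k=2):
    """Rank rows of `memory` by cosine similarity to `query_vec`."""
    q = query_vec / np.linalg.norm(query_vec)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    scores = m @ q                      # one similarity score per memory entry
    top = np.argsort(-scores)[:k]       # indices of the k best matches
    return [(int(i), float(scores[i])) for i in top]

# Toy "memory bank": one embedding per stored function
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.7]])
query = np.array([0.9, 0.1])

print(cosine_top_k(query, memory))  # entry 0 ranks first, entry 2 second
```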
| Component | Parameters |
|---|---|
| Embedding | 11.1M |
| Position Embedding | 0.1M |
| Query Encoder (4 layers) | 28.4M |
| Value Encoder (4 layers) | 28.4M |
| Decoder (12 layers) | 85.1M |
| Output Projection | 11.1M |
| Total | ~165M |
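The table's numbers are consistent with the configuration below: the embedding and output projection are each `vocab_size × d_model`, and a standard transformer layer has roughly 12·d_model² weights (≈4·d_model² for attention plus ≈8·d_model² for a 4× MLP). The 12·d_model² per-layer estimate is an assumption about the layer shape, not stated in the release; it ignores biases and LayerNorms, which is why the totals come out slightly under the table's values.

```python
# Rough parameter arithmetic for the component table above.
vocab_size, d_model = 14407, 768
per_layer = 12 * d_model ** 2  # ~4*d^2 attention + ~8*d^2 MLP weights

print(f"Embedding / output proj: {vocab_size * d_model / 1e6:.1f}M")  # ~11.1M each
print(f"4-layer encoder:         {4 * per_layer / 1e6:.1f}M")         # ~28.3M
print(f"12-layer decoder:        {12 * per_layer / 1e6:.1f}M")        # ~84.9M
```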
```json
{
  "vocab_size": 14407,
  "d_model": 768,
  "n_heads": 12,
  "n_layers": 12,
  "n_query_layers": 4,
  "max_seq_len": 128,
  "num_parameters": 165123656,
  "num_functions": 2000
}
```
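The config is plain JSON, so it can be read with the standard library; derived quantities like the per-head dimension follow directly from its fields. A minimal sketch (the config is inlined here for illustration; in practice you would read `config.json` from the downloaded snapshot):

```python
import json

# Parse the configuration shown above.
config = json.loads("""{
  "vocab_size": 14407, "d_model": 768, "n_heads": 12, "n_layers": 12,
  "n_query_layers": 4, "max_seq_len": 128,
  "num_parameters": 165123656, "num_functions": 2000
}""")

head_dim = config["d_model"] // config["n_heads"]
print(f"head_dim = {head_dim}")  # 768 / 12 = 64
```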
| File | Description |
|---|---|
| `model.npz` | Model weights (MLX-compatible NumPy format) |
| `config.json` | Model configuration |
| `tokenizer.json` | Tokenizer vocabulary |
| `functions.json` | Memory bank of 2000 Python functions |
| `inference.py` | Standalone inference script |
Trained on CodeParrot with a focus on Python function retrieval.
```bibtex
@article{sharma2026malm,
  title={Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models},
  author={Sharma, Asankhaya},
  year={2026},
  url={https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop}
}
```
Part of the HashHop project exploring long-context evaluation and memory-augmented architectures.
License: Apache 2.0