AI & ML interests

Text classification, relations extraction, NER, computational biology

Recent Activity

Ihor posted an update about 2 months ago

🧠 One Model to Classify, Verify, and Guard — Meet GLiClass-Instruct

When we first released GLiClass, it was a fast, zero-shot text classifier that could rival cross-encoders at a fraction of the cost. But classification alone wasn't enough. Our real ambition was a single, lightweight model that could handle a diverse range of text-understanding tasks by framing them as classification.
We are excited to announce GLiClass-Instruct – a leap forward that transforms GLiClass from a classifier into an instruction-following, multi-task engine.
What's new:
▪️ Hierarchical labeling: organize labels into structured groups for complex taxonomies
▪️ In-context learning via examples: provide few-shot examples to adapt on the fly, no fine-tuning needed
▪️ Prompting support: guide classification behavior with natural-language task descriptions
▪️ EWC for preventing catastrophic forgetting: add new capabilities without losing old ones
▪️ 3x faster inference thanks to FlashDeBERTa
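
For readers unfamiliar with EWC (elastic weight consolidation), here is a minimal sketch of the standard penalty term. The post does not describe how EWC is wired into GLiClass training, so the function names, the flat parameter lists, and the `lam` weight are assumptions for illustration only:

```python
def ewc_penalty(params, anchor_params, fisher, lam):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current parameter values
    anchor_params -- values learned on the earlier task (theta_star)
    fisher        -- per-parameter importance (diagonal Fisher estimates)
    lam           -- penalty strength (hypothetical default chosen by the caller)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def total_loss(task_loss, params, anchor_params, fisher, lam=1.0):
    # New-task loss plus the penalty that discourages moving important weights.
    return task_loss + ewc_penalty(params, anchor_params, fisher, lam)
```

Parameters with a high importance value are held close to their earlier values, while unimportant ones stay free to adapt to the new task.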
New multi-task capabilities:
Beyond topic classification and sentiment analysis, GLiClass now supports:
▪️ Hallucination detection: verify whether LLM outputs are grounded in context
▪️ Rule-following verification: check if content complies with custom guidelines
▪️ Safety classification: detect prompt injections, jailbreaks, and harmful requests
These tasks are crucial for building reliable and efficient agentic systems, where every LLM output needs to be verified, every user input needs to be screened, and every response needs to follow the rules, all at minimal latency.
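
As a rough illustration of the screening step, the sketch below gates user input on classifier scores. It does not call the GLiClass library: `screen_input`, `SAFETY_LABELS`, the prediction shape (`label` and `score` keys), and the `keyword_stub` classifier are all hypothetical stand-ins, and in practice a real zero-shot classifier would be passed in as `classify`:

```python
from typing import Callable, Dict, List, Sequence

# Assumed prediction shape: {"label": str, "score": float}
Classifier = Callable[[str, Sequence[str]], List[Dict[str, object]]]

# Hypothetical label set for input screening.
SAFETY_LABELS = ("prompt injection", "jailbreak", "harmful request")


def screen_input(text: str, classify: Classifier,
                 labels: Sequence[str] = SAFETY_LABELS,
                 threshold: float = 0.5) -> dict:
    """Block the input if any safety label scores at or above the threshold."""
    predictions = classify(text, labels)
    flagged = [p["label"] for p in predictions if p["score"] >= threshold]
    return {"allowed": not flagged, "flagged": flagged}


def keyword_stub(text: str, labels: Sequence[str]) -> List[Dict[str, object]]:
    """Toy classifier used only to make the sketch runnable without a model."""
    suspicious = "ignore previous instructions" in text.lower()
    return [
        {"label": label,
         "score": 0.9 if suspicious and label == "prompt injection" else 0.05}
        for label in labels
    ]
```

Keeping the classifier behind a plain callable means the same gate can be reused for hallucination or rule-following checks by swapping the labels.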
We are releasing three instruction-following models (edge, base, large). The large model matches SoTA classification models while unlocking entirely new task categories.
🔗 Explore more:
GitHub repo: https://github.com/Knowledgator/GLiClass
Models: https://huggingface.co/knowledgator/gliclass-multitask-large-v1.0
Our other solutions: https://www.knowledgator.com/
Ihor posted an update 2 months ago
Meet **GLinker** — an ultra-fast, modular, **zero-shot entity linking** framework 🚀

When we introduced the **GLiNER bi-encoder** in 2024, it enabled efficient zero-shot NER across hundreds of entity types. But that was just the beginning. Our bigger goal was always clear: **linking text to millions of entities dynamically, without retraining**.

In other words: **true entity linking at scale** ⚡

This unlocks powerful applications:
▪️ More precise search with real-world entity disambiguation
▪️ Knowledge graph construction across diverse document collections
▪️ Wikification — turning raw text into richly linked, navigable knowledge

After nearly two years of research + engineering, this vision is now real.

We’re excited to release **GLinker** — a **production-ready**, zero-shot entity linking system powered by our novel **GLiNER bi-encoder**. It efficiently detects entity spans of any length and matches them directly to entity descriptions — **no retraining required**.
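
To make the matching idea concrete, here is a toy sketch of bi-encoder style linking: entity description vectors are computed once, and each mention vector is compared against them by cosine similarity. This is not the GLinker API. The function names, entity IDs, and three-dimensional vectors are invented for illustration, and a real system would get its vectors from the encoder models:

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def build_index(entity_vectors):
    """Normalize entity description vectors once, ahead of query time."""
    return {entity_id: normalize(v) for entity_id, v in entity_vectors.items()}


def link(mention_vec, index):
    """Return the entity whose vector has the highest cosine similarity."""
    m = normalize(mention_vec)
    scores = {
        entity_id: sum(a * b for a, b in zip(m, v))
        for entity_id, v in index.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: two candidate entities for the mention "Paris".
INDEX = build_index({
    "paris_city": [1.0, 0.0, 0.2],
    "paris_hilton": [0.0, 1.0, 0.1],
})
best_entity, best_score = link([0.9, 0.1, 0.2], INDEX)
```

Because the entity side is encoded independently of the text, adding a new entity only requires one more precomputed vector, which is what makes linking without retraining plausible.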

**Why GLinker?**
▪️ Production-ready: multi-layer caching (Redis → Elasticsearch → PostgreSQL)
▪️ Research-friendly: fully configurable YAML pipelines
▪️ High performance: precomputed embeddings for bi-encoder models
▪️ Scalable by design: DAG-based execution + efficient batch processing
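
The multi-layer caching idea can be sketched as a lookup that tries the fastest store first and falls back to slower ones. This is only an assumption about the general pattern, not GLinker's implementation: plain dictionaries stand in for Redis, Elasticsearch, and PostgreSQL, and the promotion of hits into faster layers is a design choice made for this example:

```python
class LayeredLookup:
    """Look a key up in an ordered list of stores, fastest first."""

    def __init__(self, layers):
        # layers: list of (name, dict-like store), ordered fastest to slowest
        self.layers = layers

    def get(self, key):
        for depth, (name, store) in enumerate(self.layers):
            if key in store:
                value = store[key]
                # Copy the hit into every faster layer so the next lookup is cheaper.
                for _, faster_store in self.layers[:depth]:
                    faster_store[key] = value
                return value, name
        return None, None


# In-memory stand-ins for the three layers named in the post.
redis_like, elastic_like = {}, {}
postgres_like = {"Q1": "hypothetical entity description"}
lookup = LayeredLookup([
    ("redis", redis_like),
    ("elasticsearch", elastic_like),
    ("postgresql", postgres_like),
])
first_hit = lookup.get("Q1")    # served by the slowest layer, then promoted
second_hit = lookup.get("Q1")   # now served by the fastest layer
```
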

GLinker transforms raw text into **structured, disambiguated entity mentions**, bridging unstructured language with large, evolving knowledge bases.

🔗 Explore more:
GitHub: https://github.com/Knowledgator/GLinker
Report: https://github.com/Knowledgator/GLinker/blob/main/papers/GLiNER_bi_Encoder_paper.pdf
Linking models: https://huggingface.co/collections/knowledgator/gliner-linker
Bi-encoder models: https://huggingface.co/collections/knowledgator/gliner-bi-encoder
Ihor posted an update 5 months ago
Hey builders 👷‍♀️

We’re Knowledgator, the team behind open-source NLP models like GLiNER, GLiClass, and many others used for zero-shot text classification and information extraction.

If you’ve explored them on Hugging Face or used our frameworks from GitHub, we’d love your input:
🧩 Which of our models, like GLiNER or zero-shot classifiers, do you find helpful in your practical workflows?
🧩 How have setup, performance, and accuracy been for you?
🧩 Anything confusing, buggy, or missing that would make your workflow smoother?

Your feedback helps us improve speed, clarity, and stability for everyone in the open-source community.

💬 Comment directly here or join the discussion (we read every comment 😉):
GitHub: https://github.com/Knowledgator
Discord: https://discord.gg/GXRcAVJQ
Hugging Face: knowledgator

📝 Want to shape our next release?
Complete this 2-minute survey: https://docs.google.com/forms/d/e/1FAIpQLSdyz2UMHrMDX8S9stpBk0wyfngtKSYzwk-02mN1VNYDdTw8OQ/viewform