busibee.be: a free search engine for Belgian company data

I launched a new side project: busibee.be. It pulls public company data from the three official Belgian government sources (the Crossroads Bank for Enterprises, the National Bank, and the Belgian Official Gazette) and makes it all searchable in one place. Type a company name, a VAT number, an enterprise number or an address and you get a page with financials, official publications, directors, activity codes, PEPPOL status and more.

I've always found Belgian company data fascinating, but working with it is a pain. It's technically public, yet scattered across multiple government systems, each with its own quirky interface. The existing commercial alternatives are decent but expensive, and usually locked behind accounts and paywalls. I figured: if this data is public, access to it should be public too. No account, no trial, no credit card.

What you can do with it

Search by whatever you have and you land on the company page. The usual identification stuff is there (legal form, registered address, start date, PEPPOL status), but the parts I find most interesting are the financials and the publications.
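To make "search by whatever you have" a bit more concrete, here is a minimal sketch of how a typed enterprise or VAT number could be normalised and sanity-checked before a lookup. It is not busibee's actual code; the function names are made up for illustration. It relies on the publicly documented format: an enterprise number is 10 digits, a Belgian VAT number is the same digits with a BE prefix, and the last two digits are a mod-97 check on the first eight. Padding older 9-digit numbers with a leading zero is an assumption about how legacy input should be handled.

```python
import re
from typing import Optional


def normalize_enterprise_number(raw: str) -> Optional[str]:
    """Reduce user input such as 'BE 0123.456.749' to 10 bare digits.

    Hypothetical helper: returns None when the input cannot be an
    enterprise number at all.
    """
    text = raw.strip().upper()
    if text.startswith("BE"):
        text = text[2:]
    # Drop the usual separators: dots, spaces, dashes.
    digits = re.sub(r"[.\s-]", "", text)
    if not digits.isdigit():
        return None
    if len(digits) == 9:
        # Assumption: legacy 9-digit numbers get a leading zero.
        digits = "0" + digits
    return digits if len(digits) == 10 else None


def has_valid_checksum(number: str) -> bool:
    """Mod-97 check: last two digits == 97 - (first eight digits mod 97)."""
    if len(number) != 10 or not number.isdigit():
        return False
    expected = 97 - (int(number[:8]) % 97)
    return int(number[8:]) == expected
```

A check like this only says the number is well-formed. Whether a company actually exists under that number is still a lookup in the data itself.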

For financials, busibee shows a multi-year table covering assets, equity, revenue, gross margin, net profit, cash and debt, along with the main ratios (debt ratio, current ratio, quick ratio) as sparklines and an overall health score from 0 to 100. All of that is pulled from the balance sheets filed at the National Bank.
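For readers who don't juggle balance sheets daily, the three ratios can be sketched with their common textbook definitions. This is an illustration only: the field names are hypothetical, and the exact formulas busibee uses (and how the health score is derived) are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BalanceSheet:
    # Hypothetical field names; real filings use their own account codes.
    total_assets: float
    current_assets: float
    inventories: float
    total_liabilities: float
    current_liabilities: float


def _safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Return None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def debt_ratio(bs: BalanceSheet) -> Optional[float]:
    """Share of the assets that is financed with debt."""
    return _safe_div(bs.total_liabilities, bs.total_assets)


def current_ratio(bs: BalanceSheet) -> Optional[float]:
    """Short-term assets available per euro of short-term debt."""
    return _safe_div(bs.current_assets, bs.current_liabilities)


def quick_ratio(bs: BalanceSheet) -> Optional[float]:
    """Like the current ratio, but ignoring inventories (harder to sell fast)."""
    return _safe_div(bs.current_assets - bs.inventories, bs.current_liabilities)
```

With 1,000 in assets, 600 in liabilities, 400 in current assets (100 of it inventory) and 200 in current liabilities, these give a debt ratio of 0.6, a current ratio of 2.0 and a quick ratio of 1.5.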

For publications, every legal act a company has ever filed in the Belgian Official Gazette is indexed: incorporations, capital changes, board appointments, dissolutions, and so on. I run them through an LLM to extract the subjects so you can see at a glance what each publication is about. You can download the original PDFs too.

On top of that you can browse companies by region, province, city, street, or by NACEBEL activity code. There's a trending page showing which companies got the most visits today, this week or this month. Everything is available in Dutch, English, French and German.

For developers and AI folks there's also a REST API (60 requests/hour without a key, API key on request for more) and a public MCP server you can drop into Claude Desktop, ChatGPT, Cursor or any other MCP client.
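If you script against the keyless tier, it's polite (and practical) to stay under the 60 requests/hour yourself. Below is a minimal client-side sketch using a sliding one-hour window. The class name is made up, no busibee endpoints are shown, and whether the server counts with a sliding or a fixed window is an assumption on my side.

```python
import time
from collections import deque
from typing import Callable, Deque


class HourlyBudget:
    """Client-side guard: allow at most `limit` calls per `window` seconds."""

    def __init__(
        self,
        limit: int = 60,
        window: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the class can be tested without waiting
        self._stamps: Deque[float] = deque()

    def _evict(self, now: float) -> None:
        # Forget calls that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record a call and return True if there is budget left, else False."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False

    def seconds_until_free(self) -> float:
        """How long to wait before the next call would be allowed."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return max(0.0, self.window - (now - self._stamps[0]))
```

Call `try_acquire()` before each request. When it returns False, sleep for `seconds_until_free()` and try again.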

Why I like building these kinds of projects

I really enjoy data-heavy projects like this. They generate an insane amount of leaf pages: every company gets a page, every city, every street, every activity code. That's hundreds of thousands of pages, each one a valid, useful result for someone's search query. Google likes it, I like watching the sitemap grow, and every now and then a page I didn't expect shows up in my analytics because somebody searched something oddly specific.

The funny part: looking at the traffic, most of it isn't human. The majority of requests come from bots and AI crawlers. Google, Bing, ClaudeBot, GPTBot, Meta's crawler, Applebot: they're all hammering the site continuously, ingesting company data into their indexes and training corpora. I don't really mind. I built this to make Belgian company data more accessible, and if LLMs end up using it as a source, that just means more people getting correct answers about Belgian businesses when they ask an AI.

Running on my own mini server at home

The whole thing is self-hosted on an ASUS NUC 14 Pro at home, running over my regular Telenet residential connection. No cloud bills, no scaling worries, just a small box doing its thing. Cloudflare sits in front for caching, which takes a lot of pressure off the upload bandwidth, and so far it's holding up just fine. No downtime, no Telenet complaining about my traffic (not yet at least haha).

There's also a little live status page (linked in the footer) with all the nerdy stats you'd expect: requests, visitors, cache hit rate, bandwidth, energy used, external service call rates, database record counts and so on.

I'll probably write a separate post about the stack at some point. There's more going on behind the scenes than I expected when I started (data pipelines, queues, a search engine, LLM extraction, PDF parsing), but that's for another day. For now, go have a look, search for your employer, your accountant, your neighbour's bakery, and let me know what you think 🎉
