How I built a law school directory with Lovable

A fully searchable law school directory created from 200+ ABA 509 reports, built with a modern low-code stack using Python, Supabase, Airtable, and Lovable.
I've always dreamed of going to law school. But trying to compare schools? That part is a nightmare.
There’s a ton of information out there: LSAT medians, GPA percentiles, acceptance rates, tuition, scholarships. But it’s scattered across websites and buried deep in PDFs. The official ABA 509 reports are comprehensive, but unless you’re willing to open and read hundreds of files manually, they’re not exactly user-friendly.
So I decided to fix that.
I scraped and processed every single 2024 ABA 509 report, extracted the key stats, and then vibe-coded a law school directory that's clean, searchable, and actually helpful.

My vibe coding tech stack consisted of Google Colab to write Python, Lovable for the frontend, Supabase for my database, Airtable as my admin panel, and, of course, Whalesync.
In this tutorial, I’m going to outline how I scraped the data from PDFs, created the directory, synced my data from Airtable to Supabase and published my site.
If you ever want to build something similar, whether it’s a searchable directory, a data-powered app, or just wrangling a bunch of PDFs into something useful, you’ll know exactly how to do it.
Step 1: Understand what you’re scraping
Every year, the American Bar Association (ABA) publishes “509 Reports” that include key stats about every accredited U.S. law school:
- LSAT scores
- Undergraduate GPA ranges
- Acceptance rates
- Scholarship data
- Tuition
- School website
These reports are incredibly valuable, but they’re buried inside individual PDFs for each school, making them hard to compare at a glance. My goal was to turn those unstructured PDFs into a structured CSV file I could use to power a searchable, filterable Law School Directory.
If you’re building something different, say, a grant funding dashboard based on government reports, the principle is the same. You need to understand where the data lives, how it’s formatted, and what you’ll need to extract to make it actually useful.
Step 2: Set up the environment
To start building our database, we need to write and run a Python script that extracts all of the data.
I used Google Colab to run my Python scripts, since it’s a convenient way to code in the cloud. I really recommend using Google Colab, especially if you’re not familiar with IDE tools like VS Code.
This command installs three Python libraries that are essential for scraping websites and reading PDFs.
P.S. You don’t need to know how to write Python; you can get an LLM to write it for you. After all, we’re vibe coding!
!pip install requests beautifulsoup4 PyMuPDF
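Once the install finishes, you can confirm the libraries are available with a quick sanity check like this:
import requests
import fitz  # PyMuPDF installs under the module name "fitz"
from bs4 import BeautifulSoup

print("requests, PyMuPDF, and BeautifulSoup imported successfully")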
Step 3: Download the PDFs programmatically
The ABA hosts 509 reports in two ways:
- A central directory page with individual links
- A backend API that generates PDFs on demand
I chose the API route—it’s faster, cleaner, and more reliable.
But here’s the catch: there are over 200 separate PDFs, one for each school. So instead of downloading them manually, I wrote a Python script to fetch them all automatically.
import os, requests, time

# Create a folder for the downloaded reports
os.makedirs("ABA_509_2024_PDFs", exist_ok=True)

# ABA API endpoint that generates an individual 509 report as a PDF
url = "https://backend.abarequireddisclosures.org/api/AnnualQuestionnaire/GenerateIndividualDisclosure509Report"

# Loop through possible school IDs
for school_id in range(1, 205):
    params = {"schoolId": school_id, "year": 2024}
    try:
        res = requests.get(url, params=params, timeout=10)
        if res.status_code == 200 and res.headers.get("Content-Type") == "application/pdf":
            with open(f"ABA_509_2024_PDFs/ABA_509_2024_School_{school_id}.pdf", "wb") as f:
                f.write(res.content)
            print(f"Downloaded School {school_id}")
        else:
            print(f"Skipped School {school_id}")
        time.sleep(0.5)  # small pause so we don't hammer the API
    except requests.RequestException as e:
        print(f"Error with School {school_id}: {e}")
Now I have 200+ law school PDFs downloaded and ready to parse!

Step 4: Extract the data with PyMuPDF
Parsing data from PDFs is messy. To handle it, I used PyMuPDF (imported as fitz), a powerful library for extracting text from PDF files page by page. Again, you can use an LLM like ChatGPT to guide you.
Open the PDF and extract all text
I used .get_text() on every page and stitched the results into a single string.
import fitz  # PyMuPDF

doc = fitz.open(pdf_path)  # pdf_path points to one of the downloaded 509 PDFs
text = "".join(page.get_text() for page in doc)
Use Regular Expressions (re) to find key values
PDFs don’t have tags like <title> or <table>, so I used regex to match patterns in the text. For example, to find the school name:
match = re.search(r"([A-Za-z\s&'.]+?)\s*[-–—]\s*2024\s+Standard\s+509", text)
This searches for a name followed by a dash and the string “2024 Standard 509”.
Store the matched value (or return “N/A” if not found)
school_name = match.group(1).strip() if match else "N/A"
Repeat for each data point
I used different regex patterns to extract the remaining fields (a rough sketch of the full loop follows this list):
- Location
- Median LSAT
- Median GPA
- Acceptance rate
- Scholarship amounts
- Tuition
- School website
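To tie it all together, here’s a rough sketch of what the full extraction loop can look like. The school name pattern is the one from above; the LSAT and GPA patterns (and the gpa_50th key) are illustrative guesses, so you’ll need to tweak them to match how the text actually appears in the 509 PDFs:
import os
import re
import fitz  # PyMuPDF

def extract_field(pattern, text, default="N/A"):
    # Return the first captured group if the pattern matches, otherwise a default
    match = re.search(pattern, text)
    return match.group(1).strip() if match else default

results = []
for filename in sorted(os.listdir("ABA_509_2024_PDFs")):
    if not filename.endswith(".pdf"):
        continue
    doc = fitz.open(os.path.join("ABA_509_2024_PDFs", filename))
    text = "".join(page.get_text() for page in doc)
    doc.close()

    results.append({
        "school_name": extract_field(r"([A-Za-z\s&'.]+?)\s*[-–—]\s*2024\s+Standard\s+509", text),
        # Illustrative patterns; adjust them to the real wording in the PDFs
        "lsat_50th": extract_field(r"50th\s+Percentile\s+LSAT[\s:]*(\d{3})", text),
        "gpa_50th": extract_field(r"50th\s+Percentile\s+GPA[\s:]*(\d\.\d+)", text),
    })
The results list built here is what feeds the CSV export in the next step.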
Step 5: Export to CSV
After extracting all data points, I converted the results into a CSV for my Supabase database.
import pandas as pd

df = pd.DataFrame(results)
df.to_csv("aba_509_summary_2024.csv", index=False)
Unfortunately, my Python skills aren’t that great (yet), and a lot of data points were missing. I tried to debug with the help of ChatGPT, but I wasn’t able to get very far. Nonetheless, I exported the data to a CSV file.
Step 6: Import CSV to Supabase
I imported my CSV into Supabase. Once imported, I previewed the table and confirmed that fields like school_name, lsat_50th, tuition, etc. were properly typed (text, number, and so on). Supabase even auto-generates an id column for you.
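If you’d rather spot-check the imported rows from Python instead of the table editor, the supabase client library can do that too. This is just an optional sketch; the table name law_schools is a placeholder for whatever you named your table:
# pip install supabase
from supabase import create_client

# Project URL and anon key come from your Supabase project settings
supabase = create_client("https://YOUR-PROJECT.supabase.co", "YOUR_ANON_KEY")

# Pull a few rows to confirm the CSV import worked (table name is a placeholder)
rows = supabase.table("law_schools").select("school_name, lsat_50th, tuition").limit(5).execute()
print(rows.data)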

You can see the missing data points in my database. The missing data also shows up on my live website. But it’s an easy fix!

To fix this, I set up Airtable as an admin panel where I can enter the missing data, and it automatically syncs to Supabase. Here’s how I set up my sync.
Step 7: Sync Airtable and Supabase
I’m going to set up a two-way sync between Supabase and Airtable using Whalesync. I need a two-way sync because I want to get the initial data from Supabase into Airtable, so I can easily see which values are missing.
I also want to use Airtable as my Supabase admin panel: I want to be able to edit my data in Airtable and have the changes automatically sync to Supabase, which is the backend for my directory.
Create Airtable database
First, I created my Airtable database, with the same columns as my Supabase database.

Map the table
Next, I mapped my tables to ensure that the values in each column would sync from Supabase to Airtable.

Once this was done, Whalesync scanned my initial records to determine which records would be added.

I hit ‘Activate sync’ and now my Airtable database has all of my data from Supabase!

Now it’s time to start editing my data.
Step 8: Enter your data
As I mentioned before, I didn’t get all of the data I wanted when I initially scraped my PDFs, and halfway through vibe coding, I realized that 509s don’t include bar passage and employment rates; that data is found in other ABA disclosures.
However, moving forward, I am going to update my directory from Airtable.
As you can see here, Baylor Law School is missing a lot of data. Let’s edit this on Airtable.

I’ve added my data in Airtable.

This will automatically sync to Supabase and will be live on my directory.

As this is an ongoing side project, I’m going to continue adding data whenever I get a free chance. All changes will sync in real time and will appear live on my site!
Step 9: Deploy your directory
Once everything is working locally and syncing properly, it’s time to deploy! I used Vercel, an easy-to-use platform that lets you deploy apps in seconds with one click.
Connect the GitHub repo
I synced my Lovable project to a GitHub repository.

I then linked my repo to Vercel. Vercel automatically detected the framework and configured the build settings.

Click “Deploy”
Literally. One click, and Vercel handled the rest.
Add a custom domain (optional)
I purchased a domain on GoDaddy (lawschooldatahub.com) and connected it through Vercel.
Step 10: Connect your domain (optional)
Connecting the domain was a little bit tricky for me, and took me a while to figure out. Here’s what I did:
Go to project settings and click ‘Domains’

Click on ‘Add domain’
Type in your domain name (e.g., lawschooldirectory.com) and click ‘Add’.
Vercel will now show you the DNS records you need to add in GoDaddy.
Update DNS settings in GoDaddy
Log in to GoDaddy and go to ‘My Products’. Click on ‘Domains’.

Click DNS next to your domain name.

You’ll now be in the DNS Management panel.
Now, add the following records from Vercel:
For the apex domain (yourdomain.com)

For the subdomain (www.yourdomain.com)

If GoDaddy already has a conflicting A or CNAME record, delete those first.
Back in Vercel, once your DNS changes propagate (this can take 5–30 minutes), you’ll see a tick next to your domain name. That means it’s connected!
My directory is now live!
What’s next?
This is still a work in progress. I’ll keep updating the data, adding new fields like bar passage and employment stats, and making the directory even more useful for prospective law students. I might even go ahead and upgrade the UI.
If you're thinking of building something similar, I hope this tutorial showed you that it’s not just possible but genuinely doable, even if you're not a “developer.”
One of the biggest lessons I’ve learned while building this? It’s not as hard as you think. You just need access to the right tools, like Supabase, Airtable, Whalesync, Vercel, and Lovable.
I can’t wait to see what you build!