kijijijini

appraise your stuff!

About the site:

Have you ever wanted to sell a coffee table on Kijiji but didn't know what price to ask?

I sure have.

kijijijini is an application that estimates the price for an item listing based on the item type, title, and description, using machine learning! The tool learns from hundreds of thousands of listings scraped from the site.

Ways to use kijijijini:

  • Enter a draft of the listing you want to post to get a price suggestion.
  • Enter the URL or ID of a listing on Kijiji to see a price estimate and compare it to the asking price.

About the data:

kijijijini scrapes listings by querying each of the roughly 200 categories of items in the "Buy & Sell" section of Kijiji for the Toronto area. We skip sponsored listings and listings that don't offer a specific price. There hasn't been an API for Kijiji for many years, so we scrape from the results pages, which have 40 ads per page. This is a slow process because Kijiji takes a long time to respond to each request (~5 seconds), rate-limits responses, and is generally unreliable. But we persevere!
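
The fetching and filtering described above can be sketched roughly like this. It's a minimal illustration and not the actual kijijijini code: `fetch_page`, the `sponsored` and `price` keys, and the retry settings are all hypothetical names made up for the example.

```python
import time
from typing import Callable


def fetch_with_retries(fetch_page: Callable[[int], list],
                       page: int,
                       retries: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> list:
    """Call the (hypothetical) fetch_page, waiting longer after each failure."""
    for attempt in range(retries):
        try:
            return fetch_page(page)
        except IOError:
            if attempt == retries - 1:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
    return []  # only reached when retries is 0


def keep_listing(listing: dict) -> bool:
    """Skip sponsored ads and ads that don't give a specific numeric price."""
    if listing.get("sponsored"):
        return False
    return isinstance(listing.get("price"), (int, float))
```

Doubling the wait after each failure is just one common way to be polite to a slow, rate-limited server. The real scraper may handle this differently.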

Only 100 pages (4,000 listings) are accessible for a given query, even though some categories have tens of thousands of listings at any given time (listings expire after 60 days if they aren't taken down by the seller). More specific searches (using keywords) could access older results, but I don't know of a good systematic way to do this. With frequent scrapes, this shouldn't be a problem.
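
To picture why frequent scrapes get around the page cap: each scrape only reaches the 4,000 listings a query exposes, but the union over many scrapes covers more. Here is a small sketch of that bookkeeping. It assumes each listing carries a unique `id` field, and `merge_scrape` is a made-up name for illustration.

```python
MAX_PAGES = 100   # pages reachable per query
PAGE_SIZE = 40    # ads per results page
MAX_PER_QUERY = MAX_PAGES * PAGE_SIZE  # 4,000 listings


def merge_scrape(store: dict, scraped: list) -> int:
    """Add listings to store (keyed by id) and return how many were new."""
    new = 0
    for listing in scraped:
        if listing["id"] not in store:
            new += 1
        store[listing["id"]] = listing  # keep the latest version seen
    return new
```

If a scrape ever returns close to 4,000 brand-new listings for one category, that's a hint that the scrapes are too far apart and some listings may have been missed.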

Because listings are scraped from the search results page and not the individual listing pages, only the truncated descriptions are saved (up to 200 characters). Descriptions entered by a user into kijijijini are also truncated this way. It's a good rule of thumb to put the important info in the first 200 characters, since that's all that will appear in the search page preview. Hook them before they click!
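
In code, the truncation amounts to something like the sketch below. The function name is hypothetical, and the exact rule Kijiji uses (for example, whether it trims whitespace or adds an ellipsis) is not something this example tries to reproduce.

```python
MAX_DESCRIPTION_CHARS = 200  # preview length mentioned above


def truncate_description(text: str, limit: int = MAX_DESCRIPTION_CHARS) -> str:
    """Keep only the leading part of a description, as a search preview would."""
    return text[:limit]
```

Anything past the limit is simply never seen by the model, so a detail in the third paragraph of a description can't affect the estimate.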

About the model:

kijijijini uses a linear model to predict price based on the category of item and certain words appearing in the title and in the description of a listing. There is an overhead cost to a buyer to view and retrieve an item in person (most sellers do not offer delivery). With some experimentation, $25 seems to be roughly the right amount. The model predicts the logarithm of the price plus this overhead amount, that is, log(price + 25).
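
Here is a small sketch of that target and how a prediction maps back to dollars. It reads the target as log(price + overhead), which is an interpretation of the description above. The function names, the feature strings, the weights, and the clamp at zero are all assumptions for illustration, not the trained model.

```python
import math

OVERHEAD = 25.0  # dollars, the buyer's assumed cost of picking up an item


def price_to_target(price: float) -> float:
    """What the model is trained to predict: log(price + overhead)."""
    return math.log(price + OVERHEAD)


def target_to_price(target: float) -> float:
    """Invert the transform; clamp at zero (an assumption) to avoid negative prices."""
    return max(math.exp(target) - OVERHEAD, 0.0)


def predict_price(intercept: float, weights: dict, features: list) -> float:
    """Linear score over the features present, mapped back to dollars."""
    score = intercept + sum(weights.get(f, 0.0) for f in features)
    return target_to_price(score)
```

Because the model is linear in log space, each feature acts as a multiplier on price plus overhead rather than adding a fixed dollar amount. With made-up weights, a call might look like `predict_price(4.0, {"category:coffee tables": 0.3, "title:solid": 0.2}, ["category:coffee tables", "title:solid"])`.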

Note that the prediction isn't a guess of what the item is truly worth, or the maximum price it will sell for, but the amount that is most in line with previous listings on the site, whether they were priced correctly or not. There is also a lot that the model doesn't take into account (e.g. how nice an item looks in the photos). So don't be discouraged if it spits out a price for your item that's lower than you think it's worth.

About me:

kijijijini is written by me, Robert Krone. I'm a mathematician and nice person. I made this as the capstone project for the Data Incubator Fall 2020 Fellowship Program. Feel free to grab the code from GitHub and do whatever you want with it.

glhf