[HN Gopher] Show HN: A Who is Hiring app with AI filters
___________________________________________________________________
Show HN: A Who is Hiring app with AI filters
Author : bernawil
Score : 51 points
Date : 2024-01-03 19:10 UTC (3 hours ago)
(HTM) web link (bernawil.github.io)
(TXT) w3m dump (bernawil.github.io)
| afropack wrote:
| This is cool. You should add counts to the filters and the result
| list.
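
A minimal sketch of the filter counts suggested above, assuming each annotated post carries a list of tags. The `posts` data and the `tags` field name are hypothetical; the real JSON schema may differ.

```python
from collections import Counter

# Hypothetical annotated posts; field names are assumptions, not the app's schema.
posts = [
    {"id": 1, "tags": ["remote", "rust"]},
    {"id": 2, "tags": ["remote", "python"]},
    {"id": 3, "tags": ["onsite", "rust"]},
]

def filter_counts(posts):
    """Count how many posts carry each tag, for display next to each filter."""
    counts = Counter()
    for post in posts:
        # set() so a tag repeated within one post is only counted once
        counts.update(set(post["tags"]))
    return counts

counts = filter_counts(posts)
result_total = len(posts)  # count shown above the result list
```

The same counter could be recomputed over the currently filtered posts so the numbers track the active selection.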
| simonw wrote:
| Here's that data with counts on the filters, via Datasette
| Lite:
| https://lite.datasette.io/?install=datasette-pretty-json&jso...
| NickC25 wrote:
| You should also have filters for tech-tangential jobs, such as
| product, operations, design, etc...
|
| Not everyone here is a developer!
| spondylosaurus wrote:
| Seconding this, I'd love a filter for technical writers :)
| bernawil wrote:
| will probably do in a future iteration! The current job
| categories were just the ones I could come up with, and then I
| used the OpenAI API to check for conformance (or not) of each
| post to them :)
| simonw wrote:
| Here's the underlying repo:
| https://github.com/bernawil/hn-who-is-hiring
|
| This JSON file has the annotated data:
| https://github.com/bernawil/hn-who-is-hiring/blob/main/src/H...
|
| Since it's JSON on GitHub you can explore using Datasette Lite
| like this:
|
| https://lite.datasette.io/?install=datasette-pretty-json&jso...
|
| Here's an example of a custom SQL query:
| https://lite.datasette.io/?install=datasette-pretty-json&jso...
| bernawil wrote:
| Thanks for this, I didn't know about datasette. Neat!
| mrkstu wrote:
| simonw is the best kind of spammer- he brings his creation
| into the discussion- but he does so in a way that enriches
| the value of what is being discussed.
|
| His tool [Datasette] is of course such that it often is
| immensely handy at dissecting data in useful ways so is often
| exactly on point for a discussion on HN...
| bernawil wrote:
| haha right it's like, how did I make it this far without
| knowing about datasette?
| bugglebeetle wrote:
| There should be a filter for breaking California law by not
| including salary details for companies over 15 people:
|
| https://californiapayroll.com/blog/californias-new-pay-trans...
| bernawil wrote:
| hah not a bad idea at all, watch out for next month's update!
| minimaxir wrote:
| What's the AI usage here? From the raw JSON data, it seems you
| wrote a prompt to an LLM to extract structured data from the Who
| is Hiring comments, although I am not sure if that counts as an
| "AI filter" since the filtering criteria are explicitly defined
| beforehand.
| bernawil wrote:
| right, I'm feeding each post to several queries to the OpenAI
| API. I guess I put "AI filters" so people knew this is actually
| curated by post content and not just a contains() filter, where
| you'd get posts with the text "we don't do remote!" when you
| select the remote checkbox
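
To illustrate the difference described above, here is a small sketch contrasting a substring filter with a filter on a pre-computed flag. The `remote` field stands in for whatever the LLM pass produces; it and the sample posts are assumptions for illustration only.

```python
# Hypothetical posts; "remote" is assumed to have been set by an earlier LLM pass.
posts = [
    {"text": "Acme | Backend Engineer | REMOTE (US)", "remote": True},
    {"text": "Initech | SRE | Austin, TX | We don't do remote!", "remote": False},
]

def contains_filter(posts, word):
    """Naive filter: keep any post whose text mentions the word at all."""
    return [p for p in posts if word.lower() in p["text"].lower()]

def flag_filter(posts, field):
    """Curated filter: keep only posts whose annotated flag is true."""
    return [p for p in posts if p.get(field)]

naive = contains_filter(posts, "remote")    # also matches "We don't do remote!"
curated = flag_filter(posts, "remote")      # keeps only the post annotated as remote
```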
| superfrank wrote:
| Not knocking the approach, but how do you do quality control
| on the posts? Are you just spot checking? How often have you
| found bad data?
|
| I've thought about doing something similar (using ChatGPT to
| structure and categorize unstructured data) for a different
| project in a completely different space and I'm worried about
| ChatGPT hallucinating things, especially when it comes to
| numbers.
| jonnycoder wrote:
| The quality control is a good question, and one that can
| probably be addressed using evaluation as taught by some of
| the deeplearning.ai short courses (1).
|
| I made an interactive resume ai bot on my personal website
| and there is an instance where I can ask it "tell me about
| your intel experience" and it added in C++ as one of the
| languages, but that is untrue. I had done C++ at a
| different company.
|
| 1. https://www.deeplearning.ai/short-courses/
| superfrank wrote:
| Can you give more details? No offense, but I'm not going
| to sign up for a random site to watch a video of unknown
| quality and length.
| jonnycoder wrote:
| I posted the short courses just as an answer to how to
| address quality control. I'm not selling anything, and
| those courses are free anyway. deeplearning.ai was
| cofounded by Andrew Ng, who is probably best known for
| his work on teaching machine learning through
| deeplearning.ai, Coursera, Stanford, etc. He has taught
| and influenced millions.
|
| https://en.wikipedia.org/wiki/Andrew_Ng
|
| In regards to "evaluation", I think this is what those
| short courses will cover:
|
| Self-Evaluation with the LLM: The idea is to use the
| language model to generate an answer and then use the
| same or a different model to evaluate that answer. The
| evaluation could involve asking the model to rate the
| answer's accuracy, coherence, relevance, or any other
| desired metric. This self-evaluation process can be
| automated and scaled, although it's important to be aware
| of the limitations, as the model might inherit biases or
| blind spots from its training data.
|
| LangChain for Structured Evaluation: LangChain can be
| used to structure this self-evaluation process. It can
| orchestrate the flow where the LLM first generates an
| answer and then follows a series of steps to evaluate it.
| This might include breaking down the evaluation into
| specific questions or tasks that the LLM must perform to
| assess its initial response.
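
The generate-then-evaluate flow described above can be sketched without any particular framework. Both `fake_extract` and `literal_judge` are hypothetical stand-ins: in practice each would be a model call, and the judge here is a simple deterministic check rather than an LLM.

```python
from typing import Callable, Optional

def extract_with_check(
    post: str,
    extract: Callable[[str], dict],
    judge: Callable[[str, dict], bool],
) -> Optional[dict]:
    """Run an extraction pass, then a second pass that accepts or rejects it."""
    result = extract(post)
    return result if judge(post, result) else None

def fake_extract(post: str) -> dict:
    # Stand-in for a model call; it always claims both languages,
    # which mimics a hallucinated detail when the post lacks one of them.
    return {"languages": ["Python", "C++"]}

def literal_judge(post: str, result: dict) -> bool:
    # Stand-in for an evaluator: accept only if every extracted language
    # appears literally in the source text.
    return all(lang.lower() in post.lower() for lang in result["languages"])

accepted = extract_with_check("We use Python and C++ daily", fake_extract, literal_judge)
rejected = extract_with_check("We use Python", fake_extract, literal_judge)
```

A rejected result could be retried, flagged for manual review, or dropped, depending on how much bad data is tolerable.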
| bernawil wrote:
| Well to be fair, the original who is hiring post doesn't do
| much quality control, and the other apps don't either.
| Honestly, this whole thing came just out of my frustration
| using one of those and filtering for Remote, reading the
| text and finding out it wasn't remote at all.
|
| As for quality control, there's a step for categorization
| that returns some tags. Posts that don't match any are
| rejected, which kind of filters for relevancy.
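
A rough sketch of the tag-based rejection step described above, assuming the categorization pass returns a list of tags per post. `ALLOWED_TAGS` and the post structure are hypothetical.

```python
# Hypothetical category list; the app's real categories are not known here.
ALLOWED_TAGS = {"engineering", "design", "product"}

def keep_relevant(posts):
    """Drop posts whose returned tags match none of the known categories."""
    kept = []
    for post in posts:
        tags = set(post.get("tags", [])) & ALLOWED_TAGS
        if tags:
            # Keep only the recognized tags, in a stable order.
            kept.append({**post, "tags": sorted(tags)})
    return kept

sample = [
    {"id": 1, "tags": ["engineering", "unknown"]},
    {"id": 2, "tags": []},            # no match: rejected
    {"id": 3, "tags": ["marketing"]}, # unrecognized only: rejected
]
relevant = keep_relevant(sample)
```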
| PaulRobinson wrote:
| When providing a huge list of technologies, structure them
| somehow. Alphabetically ordered, for example - I shouldn't have
| to Ctrl-F to find my preferred programming language.
|
| Great idea, just needs some UX love.
| bernawil wrote:
| you're absolutely right, will get to it in a future iteration.
| For now, know you can filter by your tech on the technologies
| list. Most people are looking for either remote or a specific
| location, and after choosing one of those honestly there are
| not that many posts left to sort through.
| araes wrote:
| I think that's what is being noted: there's a huge list of
| technologies, and it would be nice if they could be sorted by
| some criteria (alphabetical, old->recent tech, grouped (all
| these are Javascript based), categories (compiled,
| interpreted, data science, graphics/imagery, ...)).
|
| Also, is there a way to put in prior "Who is Hiring?" dates?
| If people keep putting out the same listing again and again
| it would be nice to have a way to find them. Totally in the
| Nice to Have category.
| bernawil wrote:
| Gosh now I get it! thanks for the clarification, looks like
| I didn't get the point. Totally right.
|
| The search bar does filter the filters list. So if looking
| for "rust": 1. expand the technologies list 2. type rust in
| the search bar 3. select rust option.
|
| But yes, should probably sort it though.
|
| As for old posts, I just overwrite the old data and modify
| the month label, just out of laziness. I will keep post
| history in future iterations!
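
The search-then-select behaviour described above, combined with the alphabetical ordering requested earlier in the thread, might look like this sketch. `visible_options` is a hypothetical helper, not a function from the app.

```python
def visible_options(options, query=""):
    """Return the filter options matching the search text, sorted case-insensitively."""
    q = query.strip().lower()
    matches = [o for o in options if q in o.lower()]
    return sorted(matches, key=str.lower)

technologies = ["Rust", "python", "Go", "TypeScript"]
all_sorted = visible_options(technologies)        # full list, alphabetical
only_rust = visible_options(technologies, "ru")   # narrowed by the search bar
```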
| hughdbrown wrote:
| I have a similar-but-hacky command-line app that I put together
| to find just Rust positions:
|
| https://github.com/hughdbrown/who-is-hiring
|
| It's built to be pretty fast by not pulling data it does not
| need. Since it operates in multiple passes on stored data, it
| would be easy to modify/add a pass to get what you want. Feel
| free to use parts you like.
|
| A couple of things I think would help:
|
| - sorted attributes (too hard to go through a hundred computer
| technologies to find Rust)
|
| - multiple geographic entries for the same name (multiple entries
| for Germany, USA, UK, Europe)
|
| - ability to select a month
|
| - if you are showing a static pull of the data, the ability to
| refresh some month would be helpful
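
One possible way to collapse the duplicate geographic entries mentioned in the list above, assuming they come from spelling and case variants of the same place. The `ALIASES` table is hypothetical and would need to be built from the real data.

```python
# Hypothetical alias table mapping lowercase variants to one canonical label.
ALIASES = {
    "us": "USA",
    "usa": "USA",
    "united states": "USA",
    "uk": "UK",
    "united kingdom": "UK",
    "germany": "Germany",
    "deutschland": "Germany",
}

def canonical_location(raw):
    """Map a raw location label to its canonical form, if one is known."""
    key = raw.strip().lower()
    return ALIASES.get(key, raw.strip())

def merge_locations(labels):
    """Deduplicate location labels after normalization, sorted for display."""
    return sorted({canonical_location(label) for label in labels})

merged = merge_locations(
    ["USA", "usa ", "United States", "Germany", "Deutschland", "Berlin"]
)
```

Labels without an alias pass through unchanged, so unknown places are kept rather than dropped.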
| Retr0id wrote:
| It would be nice to have negative filters, e.g. "anything that
| doesn't mention AI"
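
A negative filter like the one suggested above could be sketched as a whole-word exclusion over the post text. This is a plain-text assumption; a version built on the annotated tags would be more reliable.

```python
import re

def exclude_mentions(posts, term):
    """Keep only posts that do not mention the term as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [p for p in posts if not pattern.search(p)]

posts = [
    "We build AI agents",
    "Maintain our billing system in Rails",
    "Remote, paid in full",
]
# Word boundaries keep "Maintain", "Rails" and "paid" from matching "AI".
without_ai = exclude_mentions(posts, "AI")
```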
___________________________________________________________________
(page generated 2024-01-03 23:00 UTC)