Question: Search page: Comparing search modes

milaaraujo is asking a question about general

by milaaraujo | September 12, 2018 02:40 | #17098


Hey everybody! After working on the API, Stéfanni and I are now trying to improve the Search page (results and design). Recently I opened a PR to add the boolean mode option to the search view. Now I need your help to compare the different search options in the page (https://publiclab.org/search):

image description

'Natural mode' was previously 'best match'. We changed the name, for now, to make clear that it's the natural mode search. After this period of tests, we can choose better names.

Besides saying which one is better, it would be great if you guys could document your search process:

a) what you're looking for, 
b) what search terms you use, 
c) what url you see, 
d) what you see there and how it differs from what you expect, and 
e) what pieces of content (urls) you were expecting to come up in your search

And it might also be good if you could compile a list of keywords you want "good" results for, so we can do additional tuning!

Thank you!

;)



11 Comments

Thank you, so excited to see Search Improvements beginning! I search frequently, and i just tried the term epa.

I see these Autosuggest results:

Screen_Shot_2018-09-14_at_2.14.50_PM.png

What i like about these results is that it is much clearer than before how the different types of results are indicated. The header in the context menu is very helpful. Instantly i ask myself, why are only two users mentioned? I know that more than those two people have posted about EPA (see https://publiclab.org/contributors/epa). Why is it these two that are shown? Could the context menu explain "2 of X users shown"?



I think that the Search results should not default to "Date Created" which automatically shows the oldest pages. It's true that some very important pages were created early on, however we've gotten a lot better at organizing wiki pages more recently ;).

When i search for noise using the button (not autocomplete), there is no version of results that places the page https://publiclab.org/wiki/noise on the top two rows of results. So for this search term i have no preferred method of search because none of them give me the main wiki page as the first (or at least top row) of results. Could this be fixed by putting a direct link to a tag page https://publiclab.org/tag/noise at the top of the search results page?



I think there are some huge improvements here! I've noticed that in general, the relevance of the search results is much improved. Thanks so much for all of your work on this! Some of the hiccups I've observed are below--

I had a similar experience with a search for Open Hour as Liz did: none of the initial results lead to the main Open Hour page (https://publiclab.org/openhour), though some lead to invites or recaps of previous Open Hour events. When I search for "Open Hour" (in quotes) the page is eventually listed, but not at the top of the results. I do think it's important that our search help users find wiki pages on a topic and other static pages on the site, since there aren't super intuitive ways to reach all of those pages via the existing site navigation.

A search for Barnraising brought up some similar issues-- there were notes about some previous Barnraising events in the auto suggested results, but nothing relevant appeared close to the top of any of the results pages other than "likes".

I am not sure I fully understand the "advanced search" options. I think it would be helpful to lead a user to a more typical advanced search form (where a user can select the kind of search they wish to perform and the categories they would like to search, either in a tabbed box or a few search fields with dropdown options) instead of offering a single box with the ability to sort results after the fact. If we're going to offer a site search function that breaks from convention, either visually or in terms of function, I think we might need to offer some more information about how each of the options works (i.e. "how to use this search").

Can you explain what "Natural Mode" is and how it works (especially in regards to results ranking)?

The Boolean search seems to fail if I try to use wildcard operators (I get either no results or an error page), and it doesn't seem to work quite yet with text operators: a search for "barnraising not lumcon", which I would hope would lead me to notes about any of our regional Barnraising events (EXCLUDING the ones that occurred at Lumcon), only returned ONE relevant result on the first page of any of the categories (in the "likes" category).

Observation: There seems to be a default set of results that comes up a lot in failed searches, usually including the pages for The Public Laboratory, Gulf Coast, Thermal Photography, About Public Lab, NYC page and a few others across the top, regardless of the search terms or the tab. It seems like search is defaulting to a set of popular pages when other results are slim, but it also happens for terms where there actually IS a lot of relevant content available.

I've also noticed that with searches that don't quite compute, the auto-recommendations list will re-sort itself a few times (showing tags/categories, etc., but then switching over to a list that is only usernames, many of which are just spam profiles I've visited recently for moderation, or only research notes, etc.). Is there a way to get back to the first view once the list has re-sorted? Are you able to explain the ranking of the profile suggestions? I think the cascading auto suggestions may be trying to offer too much information, but that may be a bigger conversation 😃

Thanks for all of your hard work on this!



view_all_redirect_to_search_all.png



search_page.png



results_by_type.png



Hi, Liz, thank you for your feedback!!

The main problem is that the way the search is working now is really confusing, even for us! We are trying to improve it a little, but it's a big work in progress. The typeahead and the /search are not using the same API methods, for example. We are working to consolidate the search in general, to at least return more relevant or consistent results.

But one important thing to keep in mind is that whether a certain result appears at the top of the returned results list depends on the relevance of the page.

From your feedback, we believe the search default should be the boolean mode? BUT right now what we need clearer feedback on is the boolean mode versus the natural mode. You will still be able to change the mode of the search; we just need to choose one as the default for now.

Take a look at some screenshots that we are still working on (above) :)))



Hey @Bronwen,

As Stéfanni explained to Liz, we are still working on the search. Many of the problems you mentioned will be solved - we hope so ;) - with some of the other improvements we are working on! Right now, with the current version in production, our aim is just to test natural mode vs. boolean mode vs. likes, etc. When we finish the new version of the search we will write a post explaining everything we did, and we will try to address all the concerns you shared with us here.

Can you explain what "Natural Mode" is and how it works (especially in regards to results ranking)?

So, we are using the MySQL full-text search: https://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html.

"A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators, with the exception of double quote (") characters. The stopword list applies. In addition, words that are present in 50% or more of the rows are considered common and do not match. [...] A boolean search interprets the search string using the rules of a special query language. The string contains the words to search for. It can also contain operators that specify requirements such that a word must be present or absent in matching rows, or that it should be weighted higher or lower than usual. Common words such as “some” or “then” are stopwords and do not match if present in the search string. The IN BOOLEAN MODE modifier specifies a boolean search."
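To make the difference between the two modes concrete, here is a minimal sketch of how each could be expressed as a MySQL full-text query. It is only an illustration, not the site's actual code: the table name `nodes`, the columns `title` and `body`, and the helper `build_fulltext_query` are all assumptions. Note that in boolean mode MySQL uses symbols such as `+` (must be present) and `-` (must be absent) rather than words like "not", so "barnraising not lumcon" would be written as `+barnraising -lumcon`.

```python
def build_fulltext_query(search_string, mode="natural"):
    """Return (sql, params) for a MySQL full-text search.

    Hypothetical schema: a table `nodes` with a FULLTEXT index
    on the columns (title, body). Adjust to the real schema.
    """
    modifiers = {
        "natural": "IN NATURAL LANGUAGE MODE",
        "boolean": "IN BOOLEAN MODE",
    }
    if mode not in modifiers:
        raise ValueError(f"unknown search mode: {mode!r}")

    # %s is a placeholder for a parameterized query, so the user's
    # search string is never concatenated into the SQL text.
    match = f"MATCH(title, body) AGAINST (%s {modifiers[mode]})"
    sql = (
        f"SELECT id, title, {match} AS score "
        f"FROM nodes WHERE {match} "
        "ORDER BY score DESC"
    )
    # The search string is used twice: once for the score column
    # and once for the WHERE clause.
    return sql, (search_string, search_string)


# Natural mode: the string is treated as free text.
natural_sql, natural_params = build_fulltext_query("open hour")

# Boolean mode: '+' requires a word, '-' excludes it.
boolean_sql, boolean_params = build_fulltext_query(
    "+barnraising -lumcon", mode="boolean"
)
```

The explicit `ORDER BY score DESC` is there so that both modes return results ranked by relevance.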

About the wildcard (https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html):

"The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. Words match if they begin with the word preceding the * operator."

I'm concatenating the '*' with the query directly in the code because I don't know if people, in general, know how to use it. My code: Desenho_sem_título_(4).png
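For readers who cannot see the screenshot, the idea of appending the wildcard could look roughly like the sketch below. This is not the code from the image, just an illustration under some assumptions: the function name `add_prefix_wildcards` is made up, and it skips quoted phrases and terms that already end in `*`, since the `*` operator applies to single words.

```python
import re


def add_prefix_wildcards(query):
    """Append '*' to each plain word of a boolean-mode search string.

    Quoted phrases and words that already end in '*' are left as-is.
    Other boolean operators such as parentheses are not handled here.
    """
    # A token is either a double-quoted phrase or a run of non-spaces.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    result = []
    for token in tokens:
        if token.startswith('"') or token.endswith("*"):
            result.append(token)
        else:
            result.append(token + "*")
    return " ".join(result)


print(add_prefix_wildcards('barn -lumcon "open hour" epa*'))
# prints: barn* -lumcon* "open hour" epa*
```

With this, a search for `barn` would also match words that begin with "barn", such as "barnraising".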






