Continuing from my previous research notes [1, 2], which discuss the background of my project, this note describes the implementation and highlights the features it provides.
I am happy to report that I have completed the project and integrated it into Spectral Workbench, and I am eagerly waiting for an official announcement!
What I wanted to do
A scalable spectral matching mechanism. With this in place, users see a list of results whenever the system finds similar spectra in the database. This helps users explore and learn more about the spectra they upload.
I now introduce my implementation of such a system, which I wrote for Spectral Workbench with the help of my mentor, Jeffrey Warren, and with support from Google Summer of Code 2013!
How does it work?
You will now see a 'Find Similar' button on the analyze page of every spectrum (something like spectralworkbench.org/analyze/spectrum/spectrum_id).
Clicking it takes you to the matching interface (at spectralworkbench.org/match/search/spectrum_id), which queries the database for the closest matches.
As you will see, some results are already listed, and the graph on the page shows all of those matches. You can click on the remove link beside any spectrum to clear it from the graph, or use the 'Clear plot' button to clear the entire graph, so that you can compare the results as you wish. To do that, click on the good old 'Compare' button, which comes preloaded with the results. If you want to compare with a spectrum that is not listed as a match, you can find it using the search option and compare it!
Now, we have something called 'fit'. This determines how close the returned matches are to the spectrum in question. The lower the fit, the closer the match.
In the above image, the width of the search band is what we are referring to by 'fit'. For more details about the method, please refer to my note here.
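To make the idea of a search band concrete, here is a minimal sketch in Python. It is not the code that runs on Spectral Workbench. It assumes that both spectra have already been reduced to the same number of bins, that 'fit' is the allowed intensity deviation above and below the main spectrum, and that the intensity units are arbitrary. The names max_deviation and find_similar are hypothetical.

```python
from typing import Dict, List, Sequence


def max_deviation(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest absolute intensity gap between two equally binned spectra."""
    if len(reference) != len(candidate):
        raise ValueError("spectra must have the same number of bins")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def find_similar(reference: Sequence[float],
                 database: Dict[str, Sequence[float]],
                 fit: float) -> List[str]:
    """Return ids of spectra that stay inside the band reference +/- fit.

    Assumption: 'fit' is the half-width of the band; closest matches come first.
    """
    scored = []
    for spectrum_id, candidate in database.items():
        deviation = max_deviation(reference, candidate)
        if deviation <= fit:
            scored.append((deviation, spectrum_id))
    return [spectrum_id for _, spectrum_id in sorted(scored)]


# Made-up data, only to illustrate the behaviour.
reference = [10.0, 40.0, 80.0, 30.0]
database = {
    "a": [12.0, 42.0, 78.0, 31.0],  # never more than 2 away from the reference
    "b": [20.0, 55.0, 70.0, 45.0],  # up to 15 away
    "c": [90.0, 10.0, 5.0, 60.0],   # up to 80 away
}

print(find_similar(reference, database, fit=5))   # ['a']
print(find_similar(reference, database, fit=20))  # ['a', 'b']
```

A narrower band (lower fit) keeps only the spectra that hug the main spectrum, while a wider band lets more distant spectra through.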
Now, let's see how the 'fit' parameter changes the results.
Consider the matching page for spectrum 431, as shown below.
This page shows fit = 90. The system selected this value automatically so that a good number of matches is displayed. You can always change it and watch the results change in the graph. As simple as that. So let's go ahead and see how this changes the results.
I obtain this when I change the fit parameter to 85:
Notice that some matches from the previous image are no longer shown here! Now let's change the fit to 100 and see how it works.
Simple, isn't it? If you like, you can even go ahead and click on the "Save as set" button, and the matches will be saved as a new set of spectra!
What can this do? And what can't it?
As described in my previous note, it searches for matches in the close vicinity of the spectrum, both above and below it. To account for the x-shift problem, where the spectrum may be shifted in the positive or negative X-direction, i.e., to the left or right of its expected position (mostly due to differences in capture conditions), I have averaged the relative intensity values over overlapping bins so that the curve gets smoother.
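The averaging over overlapping bins can be sketched as follows. This is only an illustration in Python, not the Spectral Workbench code. The function name overlapping_bin_means, the bin size, and the step are assumptions chosen for the example.

```python
from typing import List, Sequence


def overlapping_bin_means(intensities: Sequence[float],
                          bin_size: int,
                          step: int) -> List[float]:
    """Average intensities over windows of bin_size points, moving by step.

    Bins overlap whenever step is smaller than bin_size.
    """
    if bin_size <= 0 or step <= 0:
        raise ValueError("bin_size and step must be positive")
    means = []
    for start in range(0, len(intensities) - bin_size + 1, step):
        window = intensities[start:start + bin_size]
        means.append(sum(window) / bin_size)
    return means


# The same single peak, captured one sample further to the right.
spectrum = [0, 0, 100, 0, 0, 0]
shifted = [0, 0, 0, 100, 0, 0]

print(overlapping_bin_means(spectrum, bin_size=4, step=2))  # [25.0, 25.0]
print(overlapping_bin_means(shifted, bin_size=4, step=2))   # [25.0, 25.0]
```

Point by point, the two raw spectra differ by 100 at the peak, yet their binned versions are identical. That is the smoothing effect that makes a small shift along the X-axis tolerable, and it is also why very different spectra can end up looking alike after averaging.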
Because of the averaging, and especially at higher fit values, some false positives will be reported, like this one: Finding Neon by Chris Fastie. We still need to filter the matches (with something like peak matching or counting). Before deciding how to do that, we want to collect details about other issues with the system and then solve them appropriately.
Also, as straylight pointed out (in a comment on this post), the matching algorithm works only for calibrated spectra and searches for the closest matches only among calibrated ones.
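As an illustration of what a peak-counting filter could look like, here is a rough sketch in Python. Nothing like this is implemented in the system yet. The names count_peaks and filter_by_peak_count, the height threshold, and the tolerance are all hypothetical.

```python
from typing import Dict, List, Sequence


def count_peaks(intensities: Sequence[float], min_height: float) -> int:
    """Count strict local maxima that reach at least min_height."""
    peaks = 0
    for i in range(1, len(intensities) - 1):
        is_local_max = intensities[i - 1] < intensities[i] > intensities[i + 1]
        if is_local_max and intensities[i] >= min_height:
            peaks += 1
    return peaks


def filter_by_peak_count(reference: Sequence[float],
                         candidates: Dict[str, Sequence[float]],
                         min_height: float,
                         tolerance: int = 0) -> List[str]:
    """Keep candidates whose peak count is within tolerance of the reference."""
    target = count_peaks(reference, min_height)
    return [
        spectrum_id
        for spectrum_id, intensities in candidates.items()
        if abs(count_peaks(intensities, min_height) - target) <= tolerance
    ]


# Made-up data: the reference has two clear peaks.
reference = [0, 50, 0, 60, 0]
candidates = {
    "two_peaks": [0, 45, 5, 55, 0],    # two peaks, plausible match
    "smooth": [20, 30, 35, 30, 20],    # one broad hump, likely a false positive
}

print(filter_by_peak_count(reference, candidates, min_height=30))  # ['two_peaks']
```

A filter of this kind would run on the matches returned by the band search, discarding those whose overall shape is clearly different even though they fall inside the band.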
Yes. I take this section to introduce you to another special feature -- Live Matching (which is still a prototype, with accuracy improvements on the way).
From now on, you will see a "Start Live Matching!" button on the capture page, like this one:
After clicking on it, you will be shown the closest matches the system is able to find for the spectrum you are about to capture. In my opinion, this opens the way to various interesting experiments!
If there are some matches, then something like this will be seen:
If there are no matches, then you will see:
Sorry if you keep seeing the "No matches found" message. This feature will be improved in the days to come. If you feel that it is distracting you from your work, you can click on the 'Stop' link displayed in a message like "Refreshing in 5 secs (Stop)" and everything will be as it was before!
What has changed from the previous note?
In the previous note, which can be found here, I introduced a working prototype of the system, and it received a great response and many suggestions. I implemented those suggestions, including narrowing the bin size, using all the available data points, and making use of overlapping bins.
Yep! The project has come to an end. All's well that ends well.
I had no idea what I was about to face when I took up this project. I soon realized that this is not an image processing problem (see my note here) and was lucky enough to come up with an approach to do this mathematically. Now, I know what a spectrum is. I know what calibration means. And most importantly, I contributed a small feature for the scientists, teachers, amateur physicists, and students out there who are interested in spectral analysis.
I was, and still am, excited about this project, but many times I was discouraged by experiments that failed horribly. Thanks to my mentor, Jeffrey Warren, who always motivated me to do more and to do it the right way. Thanks Jeff.
A community is what makes open source so special! And I am very lucky to be backed by some of the most interesting and innovative people. I enjoyed taking suggestions from them.
Thanks Dave, who made things very tough for me in the initial days (as I was unable to understand terms like bin and overexposure when he used them), but now I enjoy exchanging emails with him on a regular basis about my project! His contributions to the project are invaluable.
Special thanks to Chris Fastie and Nathan McCorkle for suggesting features and helping me find various bugs. Thanks to Bob, who offered a helping hand along with Jeff and Dave during my pre-GSoC period to help me structure my proposal. Special thanks also to the Earthquake Bolt Barnstar, Liz Barry, who was quite active and who mentioned "Sreyanth - Developing killer features for spectral workbench" in a presentation about Public Lab. Thanks to the Dev Manager, Becki Chall, who, as GSoC Administrator for Public Lab, promptly forwarded details about GSoC deadlines and updates. Thank you guys for making my first ever GSoC a wonderful experience!
Also, my sincere apologies for the recently reported bugs that my code edits unintentionally introduced. Sorry for the inconvenience caused.
And last but not least, thanks to you for patiently reading this lengthy note! Should you have any issues, queries or suggestions, please feel free to contact me at firstname.lastname@example.org.