All I want for Solr 10 is...
My wish list for what we need to complete to trigger the release of the next major version of Solr...
Of course it starts with an upgrade to the latest Lucene…. I’m looking forward to a 10.1 version of Lucene with all the early adopter bugs worked out ;-).
The rest of the list, in no specific order:
The MVP of the new Admin UI! Christos has been working on a Kotlin-based Admin UI that runs both on the desktop and in the browser. It’s totally separate from the existing Admin UI, so it’s a chance to re-imagine what the user interface to Solr can be! I for one am also excited to have a reason to dig a bit deeper into Kotlin.
Channel our inner Marie Kondo and thank Hadoop for its service. In 10 we already removed the Hadoop Auth module that provided Kerberos-based authentication. Now I’d like to take a hard look at the Hadoop HDFS module as well. In our survey, which admittedly wasn’t super scientific, nobody said they use the module.
Finish the Jetty 12 migration! Sanjay has been doing awesome work on getting a core dependency upgraded. Let’s not fall behind again?
Related to Jetty 12 (maybe?) is the migration from Apache HttpClient to the Jetty HttpClient underlying our Java client. This has been a huge effort led by Sanjay and David, and I am really excited to see fewer Java classes related to this.
Java File -> Path migration in our source code. This is a nice bit of updating, and once we get it done we’ll have fewer conversions between these approaches. Thanks Matthew for this!
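To make the File -> Path item concrete, here is a minimal sketch of the kind of change involved. It is not taken from the Solr codebase: the class name, the method names, and the `conf/solrconfig.xml` layout used here are purely illustrative assumptions. It only shows the general shape of moving from `java.io.File` to `java.nio.file.Path`, and the `toFile()` / `toPath()` bridging that a half-migrated codebase ends up doing.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; none of these names come from Solr itself.
public class FileToPathSketch {

  // Legacy style: build the location by nesting java.io.File constructors.
  public static File legacyConfigFile(File instanceDir) {
    return new File(new File(instanceDir, "conf"), "solrconfig.xml");
  }

  // Migrated style: the same location expressed with java.nio.file.Path.
  public static Path configPath(Path instanceDir) {
    return instanceDir.resolve("conf").resolve("solrconfig.xml");
  }

  public static void main(String[] args) throws IOException {
    Path instanceDir = Files.createTempDirectory("core-sketch");

    // With Path, directory creation and I/O go through java.nio.file.Files.
    Path config = configPath(instanceDir);
    Files.createDirectories(config.getParent());
    Files.writeString(config, "<config/>", StandardCharsets.UTF_8);

    // While both styles coexist, callers keep converting back and forth;
    // these conversions are what disappears once the migration is finished.
    File legacy = legacyConfigFile(instanceDir.toFile());
    System.out.println("Same location: " + legacy.toPath().equals(config));
    System.out.println("Contents: " + Files.readString(config));
  }
}
```

Once every method takes and returns `Path`, the `toFile()` and `toPath()` calls in the middle simply go away, which is the "fewer conversions" payoff.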
Finish up writing the “This is how you SHOULD deploy Solr at various scales” ref guide documentation. Now that Cloud is the default mode, we should provide more advice to users on this topic.
Remove the Java Security Manager from Solr. It no longer works in JDK 24 anyway, so we should remove it to make room for the future.
While we are looking at things that are meant to improve security but don’t, let’s strip out the “trusted” flag for configsets. While it was a good idea, the implementation is complex and we don’t think it really improves our security posture compared to other approaches.
By the way, this query will show you the blockers already identified: https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20priority%20%3D%20Blocker%20AND%20status%20%3D%20OPEN and we are currently at 10 identified blockers.
I was thinking about the state of our V2 APIs, and I think that they are an ongoing effort through the life of 10x. We finally have enough published that anyone building on Solr SHOULD use them first, and only fall back to a V1 API if the V2 equivalent is missing. I suspect over the life of Solr 10 we will be slowly swapping out V1 API calls in favour of V2 equivalents in our clients, CLI tooling, Admin tooling, etc.
I also think that any major changes to how SolrCloud works will be an ongoing effort for 10x, not something that should prevent 10.0 from being released.
A reach goal for me for 10 would be to rethink Solr’s extraction module to leverage the Tika Pipes work, so that you can run Solr and have documents posted to it seamlessly processed by a separate, external Tika process. This should reduce the number of dependencies in Solr 10 massively! And make it much less of a burden for us to upgrade to the latest versions of Tika.
Another reach goal is to get some more robust benchmarking of performance in place, akin to the Lucene ones. We have a prospective client that may fund this work; let’s hope we get through contracts and can put something small in place early in 2025 ;-).
With this, we’ll have a leaner, simpler-to-maintain codebase that increases developer happiness, and then we can look at adding all the new cool capabilities (looking at the llm module for example from Alessandro and crew) in future 10x releases.
So? March 2025?