Status Update
Comments
ge...@gmail.com <ge...@gmail.com> #2
Note that Gerrit version 3.8.3
is free from this issue
lu...@gmail.com <lu...@gmail.com> #4
This is a P0 because it breaks the offline reindex.
lu...@gmail.com <lu...@gmail.com> #5
See the off-line reindex log that breaks the change index:
$java -jar bin/gerrit.war reindex
[2024-01-25 20:07:11,446] [main] INFO com.google.gerrit.server.git.SystemReaderInstaller : Set JGit's SystemReader to read system config from ./etc/jgit.config
[2024-01-25 20:07:11,469] [main] INFO com.google.gerrit.server.git.LocalDiskRepositoryManager : Defaulting core.streamFileThreshold to 2047m
[2024-01-25 20:07:12,236] [main] INFO com.google.gerrit.server.git.WorkQueue : Adding metrics for 'WorkQueue' queue
[2024-01-25 20:07:12,243] [main] INFO com.google.gerrit.server.cache.PersistentCacheBaseFactory : Enabling disk cache /Users/lucamilanesio/gerrit-3.9/cache
[2024-01-25 20:07:12,278] [main] INFO com.google.gerrit.server.git.WorkQueue : Adding metrics for 'Index-Interactive' queue
[2024-01-25 20:07:12,281] [main] INFO com.google.gerrit.server.git.WorkQueue : Adding metrics for 'Index-Batch' queue
[2024-01-25 20:07:12,384] [main] INFO com.google.gerrit.server.git.WorkQueue : Adding metrics for 'ReceiveCommits' queue
[2024-01-25 20:07:12,384] [main] INFO com.google.gerrit.server.git.WorkQueue : Adding metrics for 'SendEmail' queue
[2024-01-25 20:07:12,580] [main] INFO com.google.gerrit.server.rules.prolog.PrologEnvironment : reductionLimit: 100000, compileLimit: 1000000
[2024-01-25 20:07:12,792] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loading plugins from /Users/lucamilanesio/gerrit-3.9/./plugins
[2024-01-25 20:07:12,795] [main] WARN com.google.gerrit.server.project.PeriodicProjectListCacheWarmer : project_list cache warmer is disabled
Collecting accounts: 1
Reindexing accounts: 100% (1/1)
Reindexed 1 documents in accounts index in 0.3s (3.3/s)
Index accounts in version 13 is ready
Reindexing groups: 100% (2/2)
Reindexed 2 documents in groups index in 0.2s (9.7/s)
Index groups in version 10 is ready
[2024-01-25 20:07:13,386] [main] INFO com.google.gerrit.server.git.MultiProgressMonitor : Reindexing changes: Slicing projects: 66% (2/3) [CONTEXT ratelimit_period="1 MINUTES" ]
Reindexing changes: changes: 100% (1/1), project-slices: 100% (1/1), Slicing projects: 100% (3/3), done
Reindexed 1 documents in changes index in 0.1s (7.8/s)
Index changes in version 84 is ready
Reindexing projects: 100% (3/3)
Reindexed 3 documents in projects index in 0.1s (30.9/s)
Index projects in version 8 is ready
[2024-01-25 20:07:14,160] [ForkJoinPool-1-worker-1] WARN org.eclipse.jgit.internal.util.ShutdownHook : Cleanup org.eclipse.jgit.util.FS$FileStoreAttributes$$Lambda$70/0x00000001460ba2c8@5c671d7f during JVM shutdown
th...@gmail.com <th...@gmail.com> #6
I am sorry that my change causes such an issue. I don't have my MacBook available right now, so I can't do a proper analysis right now. I will do so asap. My current hypothesis is that the indexer detects that the change is in the index and skips it. There should be a log line telling that if using the --verbose flag. Iirc correctly, offline reindexing was meant to always force reindexing. I am also not sure why the change is gone, since deciding to skip the index involves a check whether a change is present in the write index with the highest version number. I need to debug to find out more.
lu...@gmail.com <lu...@gmail.com> #7
I'm investigating now and will share more insights here shortly, hopefully those would help finding the culprit.
th...@gmail.com <th...@gmail.com> #8
From reading the code: The reindex pm tool gets the search index and calls deleteAll() on it. Then it calls SiteIndexe.indexAll() with that index. The check whether to skip indexing checks in the latest write index whether the change is available. The license docs state that the deletion by IndexWriter.deleteAll() is only visible after calling commit(), which we don't do as far as I can see. So Gerrit might think the entry is still there, although it has been deleted, and than skip to index the change again, which results in an empty index, if all changes were already present.
lu...@gmail.com <lu...@gmail.com> #9
The license docs state that the deletion by IndexWriter.deleteAll() is only visible after calling commit(), which we don't do as far as I can see. So Gerrit might think the entry is still there, although it has been deleted, and than skip to index the change again, which results in an empty index, if all changes were already present.
Yes, that explains. Also, if you re-run the offline reindex they all appear again. The index is empty, then the reindex will re-create them.
P.S. To me we just need to force the reindex during offline reindexing: there is no point in trying to search for the existing ones, isn't it?
th...@gmail.com <th...@gmail.com> #10
Yes. And as I wrote in an earlier comment, I thought that was what I had done. However, I created the change a year ago, so obviously I remember it incorrectly. If we anyway delete the index before indexing, there is no sense behind looking for existing changes in the index.
The change was done having online reindexing in mind. Back then we had quite some issues when indexing changes of some large projects, which crashed Gerrit. Since indexing back then took 1-2 weeks that cost us a lot of time. That is why we wanted to skip everything that was already indexed.
lu...@gmail.com <lu...@gmail.com> #11
P.S. To me we just need to force the reindex during offline reindexing: there is no point in trying to search for the existing ones, isn't it?
I tried this approach but it would break compatibility with the indexer interface, which is unwanted in a stable branch. I'd propose to just auto-enable the force when the source and destination index have the same version, hence we are reindexing on the same index.
lu...@gmail.com <lu...@gmail.com> #12
@Thomas I've proposed a workaround for stable-3.9 at
On master, we need a different solution: the forcing of reindex should be explicit IMHO, but I'm open to suggestions and other approaches.
th...@gmail.com <th...@gmail.com> #13
Since the workaround would have effectively deactivated the feature to skip existing changes, Luca and I decided to revert the change instead.
th...@gmail.com <th...@gmail.com> #14
I will work on fixing this issue on Gerrit master.
ap...@google.com <ap...@google.com> #15
Branch: master
commit ce49afb24e97e600a48dfc0f68f2014d20bdb197
Author: Luca Milanesio <luca.milanesio@gmail.com>
Date: Fri Jan 26 08:53:41 2024
Revert "During online reindexing of all changes skip changes already present"
This reverts commit 366ae421f263acf201adbcc1f13c69f03fef40cf.
Reason for revert: Broke the offline reindexing (see
Change-Id: Ifa90a4eddb8b6eccc2bee7d99757e478fd983b1e
M java/com/google/gerrit/server/index/OnlineReindexer.java
M java/com/google/gerrit/server/index/change/AllChangesIndexer.java
M java/com/google/gerrit/server/index/change/ChangeIndexer.java
M java/com/google/gerrit/server/index/change/IndexedChangeQuery.java
ap...@google.com <ap...@google.com> #16
Branch: stable-3.9
commit 7a127d54a94b790a0d078df337420b8895ef8e33
Author: Luca Milanesio <luca.milanesio@gmail.com>
Date: Fri Jan 26 08:53:41 2024
Revert "During online reindexing of all changes skip changes already present"
This reverts commit 366ae421f263acf201adbcc1f13c69f03fef40cf.
Reason for revert: Broke the offline reindexing (see
Bug:
Release-Notes: Fix the disappearance of all changes after performing a full reindex
Change-Id: Ifa90a4eddb8b6eccc2bee7d99757e478fd983b1e
(cherry picked from commit ce49afb24e97e600a48dfc0f68f2014d20bdb197)
M java/com/google/gerrit/server/index/OnlineReindexer.java
M java/com/google/gerrit/server/index/change/AllChangesIndexer.java
M java/com/google/gerrit/server/index/change/ChangeIndexer.java
M java/com/google/gerrit/server/index/change/IndexedChangeQuery.java
[Deleted User] <[Deleted User]> #17
Och l
[Deleted User] <[Deleted User]> #18
كحبه
Description
Step to reproduce
java -jar gerrit.war reindex
Expected result
C1 is in the change list
Observed result
C1 is not in the list of changes. When indexing by project only, for example
ssh -p 29418 admin@localhost gerrit index changes-in-project P1
, this doesn't happen.Doing an offline reindex multiple times will create and clobber the change index alternatively.
This change was recently introduced in the indexing code. I haven't reverted it yet to test if that's the one introducing the regression.