Consider the following two URLs:
jar:file:/test.jar!/BOOT-INF/classes!/foo.txt
jar:file:/test.jar!/BOOT-INF/classes/foo.txt
Both reference the same foo.txt file in the BOOT-INF/classes
directory of test.jar; however, the first URL does so via the
nested BOOT-INF/classes archive. Previously, this difference in the
URLs would lead to PathMatchingResourcePatternResolver returning two
resources for foo.txt when asked to find all resources matching the
pattern classpath*:/**/*.txt.
This commit updates our Handler that is used for jar: URLs to consider
the two URLs above to be equivalent, such that the first URL is equal
to the second and the two URLs produce the same hash code.
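One way to picture the equivalence is a canonical form in which only the
first "!/" (the archive boundary) is kept and any later "!/" in the entry
path is treated as a plain "/". The sketch below is an illustration under
that assumption, not the Handler's actual code; the class and method names
(NestedJarUrlKey, canonicalize, sameResource, hash) are hypothetical.

```java
class NestedJarUrlKey {

    private static final String SEPARATOR = "!/";

    // Hypothetical helper: keep the first "!/" (archive boundary) and
    // rewrite any later "!/" inside the entry path to a plain "/".
    static String canonicalize(String spec) {
        int first = spec.indexOf(SEPARATOR);
        if (first < 0) {
            return spec;
        }
        String archive = spec.substring(0, first);
        String entry = spec.substring(first + SEPARATOR.length()).replace(SEPARATOR, "/");
        return archive + SEPARATOR + entry;
    }

    // Two specs name the same resource when their canonical forms match.
    static boolean sameResource(String a, String b) {
        return canonicalize(a).equals(canonicalize(b));
    }

    // Hashing the canonical form keeps hash codes consistent with sameResource.
    static int hash(String spec) {
        return canonicalize(spec).hashCode();
    }

}
```

With this helper, the two URLs above compare as the same resource and
hash identically, while a URL for a different entry does not match.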
Closes gh-7449
Previously, if Boot's JarURLConnection pointed to the root of a nested
entry, e.g. /BOOT-INF/classes, a call to getInputStream() would throw
an IOException. This behavior is reasonable for a URL that points
to the root of a normal jar as the jar itself is on the class path
anyway. However, for a nested jar it meant that a call to
ClassLoader.getResources("") would not include URLs for any nested
jars and directories (/BOOT-INF/classes and jars in /BOOT-INF/lib).
This is due to some logic in URLClassPath.Loader.findResource that
verifies a URL by opening a connection and calling getInputStream().
The result of missing URLs for the root of nested jars and directories
is that classpath scanning that scans from the root (not a good idea
for performance reasons, but something that we should support) would
not find entries in /BOOT-INF/classes or in jars in /BOOT-INF/lib.
This commit updates our JarURLConnection so that it no longer throws
an IOException when asked for an InputStream for the root of a nested
entry (directory or jar).
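The behaviour described for the root of a normal jar can be reproduced
with the JDK's own jar: handler. The sketch below is a minimal
illustration only: it uses a plain temporary jar rather than a nested
entry, it exercises the JDK's connection rather than Boot's, and the
class and method names (JarRootStreamDemo, rootStreamThrows) are
hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class JarRootStreamDemo {

    // Returns true if getInputStream() on the root of a jar throws an IOException.
    static boolean rootStreamThrows() {
        try {
            // Build a small throwaway jar containing a single entry.
            Path jar = Files.createTempFile("demo", ".jar");
            jar.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(jar);
                    JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry("foo.txt"));
                jos.write("hello".getBytes());
                jos.closeEntry();
            }
            // "jar:<file-url>!/" points at the root of the jar, not at an entry.
            URLConnection connection = URI.create("jar:" + jar.toUri() + "!/").toURL().openConnection();
            connection.setUseCaches(false);
            try (InputStream in = connection.getInputStream()) {
                return false;
            }
            catch (IOException ex) {
                return true;
            }
        }
        catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

}
```

A verification step that opens a connection and calls getInputStream(),
as described above, would therefore reject such a root URL.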
Fixes gh-7003
This commit restores the logic in Handler that was changed when
d20ac56a was merged, while leaving the structural improvements intact.
In addition to a couple of changes where a typo meant the wrong
variable was being referenced, some logic branches now return false
rather than calling super. This realigns our Handler's behaviour with
that of the JDK's.
Some more tests have also been added to try to catch the problems that
were introduced during the merge.
Closes gh-7021
Previously, our handler didn't override parseURL or sameFile, which
resulted in behaviour that differed from that of the JDK's handler.
Crucially, this would result in our JarURLConnection being passed
a spec that didn't contain a "!/". A knock-on effect of this was
that the connection would point to the root of the jar rather than
the intended entry.
Closes gh-7021
URL.getContent() is shorthand for URL.openConnection().getContent().
It creates an InputStream that isn't explicitly closed. This means
that a file handle remains open until the URLConnection is garbage
collected. This can lead to the process exceeding the limit for open
files.
Previously, LaunchedURLClassLoader was using getContent() when
proactively defining a package for a class that is about to be loaded.
getContent() was used to access nested jar files to check if they
contained the package and, if so, to retrieve the jar's manifest.
In place of using getContent(), this commit uses JarURLConnection's
getJarFile() method which provides access to the JarFile without the
unwanted side-effect of opening an input stream.
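As a rough illustration of the approach, the sketch below reads a
manifest attribute through java.net.JarURLConnection.getJarFile()
without requesting an input stream from the connection. It runs against
a throwaway jar using the JDK's jar: handler, so it is not the
LaunchedURLClassLoader code; the class and method names
(ManifestViaJarFileDemo, readImplementationTitle) are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class ManifestViaJarFileDemo {

    // Writes a jar with a manifest, then reads Implementation-Title back via getJarFile().
    static String readImplementationTitle() {
        try {
            Manifest manifest = new Manifest();
            // Manifest-Version must be set or the main attributes are not written.
            manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            manifest.getMainAttributes().put(Attributes.Name.IMPLEMENTATION_TITLE, "demo");
            Path jar = Files.createTempFile("demo", ".jar");
            jar.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(jar);
                    JarOutputStream jos = new JarOutputStream(out, manifest)) {
                // The manifest is the only content this jar needs.
            }
            JarURLConnection connection = (JarURLConnection) URI
                    .create("jar:" + jar.toUri() + "!/").toURL().openConnection();
            // Without caching, the caller owns the JarFile and must close it.
            connection.setUseCaches(false);
            try (JarFile jarFile = connection.getJarFile()) {
                return jarFile.getManifest().getMainAttributes()
                        .getValue(Attributes.Name.IMPLEMENTATION_TITLE);
            }
        }
        catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

}
```

Closing the JarFile explicitly is only appropriate here because caching
is disabled; a cached JarFile may be shared with other connections.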
Closes gh-7180
Previously, RandomAccessDataFile used a semaphore and acquired it
interruptibly. This meant that an interrupted thread was unable to
access the file. Notably, this would prevent LaunchedURLClassLoader from
loading classes or resources on an interrupted thread.
The previous commit (937f857) updates RandomAccessDataFile to acquire
the semaphore uninterruptibly. This commit adds a test to
LaunchedURLClassLoader to verify that it can now load a resource from
an interrupted thread.
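The difference between the two acquisition modes can be seen with a
plain java.util.concurrent.Semaphore. The sketch below is a standalone
illustration, not the RandomAccessDataFile code or the added test; the
class and method names (InterruptedAcquireDemo and its methods) are
hypothetical.

```java
import java.util.concurrent.Semaphore;

class InterruptedAcquireDemo {

    // Returns true if acquire() refuses a free permit because the thread is interrupted.
    static boolean interruptibleAcquireFails() {
        Semaphore semaphore = new Semaphore(1);
        Thread.currentThread().interrupt();
        try {
            semaphore.acquire();
            semaphore.release();
            return false;
        }
        catch (InterruptedException ex) {
            // acquire() throws immediately and clears the interrupt flag.
            return true;
        }
        finally {
            // Make sure the calling thread is left uninterrupted.
            Thread.interrupted();
        }
    }

    // Returns true if acquireUninterruptibly() takes the permit despite the interrupt.
    static boolean uninterruptibleAcquireSucceeds() {
        Semaphore semaphore = new Semaphore(1);
        Thread.currentThread().interrupt();
        try {
            semaphore.acquireUninterruptibly();
            boolean acquired = semaphore.availablePermits() == 0;
            semaphore.release();
            return acquired;
        }
        finally {
            // The interrupt flag is still set at this point; clear it for the caller.
            Thread.interrupted();
        }
    }

}
```

Both methods return true: the interruptible call fails even though a
permit is free, while the uninterruptible call obtains it.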
Closes gh-6683