This weekend I had a chance to play a bit with the SIMD instructions of a variety of architectures (in particular the Playstation3 Cell BE and my laptop Intel Centrino vPro machine). I think I'll share the results of those tests in a different article, but for now I just want to talk about the different solutions (and especially their drawbacks) that can be used to measure the time passed while executing a critical section of code.
While micro optimisations and micro benchmarking are usually evil for a number of reasons, from time to time it's interesting to measure the performance of a chunk of code, even when the chunk is small, either for comparison or to collect statistics about performance critical sections. Other times such a measurement is mandatory, in order to ensure that a critical section of code runs at a constant (or at least stable) speed. One example that comes to my mind right now is the main game loop of a video game.
All of the following methods measure the time elapsed since a known "checkpoint" in time, so they all require you to take the measurement twice, once at the beginning and once at the very end of the measured section of code, and subtract the two values. The differences lie in the way this operation is performed (for example, some methods use specific structures and you need to subtract the values of their fields), in the resolution, and in the accuracy of the measurement. Not all methods are reliable in all cases, and the resolution may be too coarse to give correct results for a given problem.
Using time
time_t time(time_t *calptr);
The function time is declared in time.h. It has only one-second resolution, measuring time in seconds elapsed since the Epoch (oh, I have always wanted to write this!). I don't think there are many real use cases for it, but it is probably the most portable method across operating systems.
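As a minimal sketch of the two-sample pattern with time, the helper below (seconds_since is a hypothetical name, not part of any library) takes a second sample and subtracts it from an earlier one with the standard difftime function:

```c
#include <time.h>

/* Hypothetical helper: whole seconds elapsed since `start`. */
static inline double seconds_since(time_t start)
{
    time_t now = time(NULL);        /* seconds since the Epoch */
    return difftime(now, start);    /* portable way to subtract two time_t */
}
```

You would call time(NULL) before the measured section, keep the result, and pass it to seconds_since afterwards. Anything shorter than a second simply reads as 0 or 1.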
Using gettimeofday
int gettimeofday(struct timeval *restrict tp, void *restrict tzp);
The gettimeofday function is a BSD-derived function declared in sys/time.h. As you may guess, it's portable across operating systems: it's POSIX, and it's basically there on every BSD/HP/Linux/QNX/whatever system you have. The problem is that its resolution is variable. The timeval structure has a field that supports microsecond resolution (one millionth of a second), but the actual granularity may not be 1 µs (and in fact most of the time it isn't).
Reading directly the high resolution time register
This is the least portable way, as it relies on a specific instruction to access the high resolution time register of the target CPU. Not only that, the very concept of a "high resolution time register" is rather fuzzy. ARM, as far as I know, doesn't have one. Recent PowerPC chips have a cool register called the "Time Base" register and an associated assembly instruction, mftb. This is the case, for example, in the Power Processing Unit of the Playstation 3 Cell BE (which is probably my favourite architecture), but if you want to profile the SPU you don't have direct access to this register. For the SPU elements you can use the SPU Decrementer and its associated intrinsics instead. This method is also non-portable to older PowerPC, because those chips do not have the mftb instruction at all (and those older architecture versions are very common in the embedded world). In that case you need to access two special purpose registers, TBL and TBU, using the mfspr instruction.
The Intel version of this register is called the "Time Stamp Counter" and its assembly instruction is rdtsc. It is available from the Pentium processor family onwards, but you still have to take care to detect it on Intel-compatible systems. Also, on some CPUs it suffers from problems when power saving is active, and it may give incorrect results when out-of-order execution is performed or, more generally, when threads run on multicore/multiprocessor systems.
I'm afraid of pasting m$ related stuff, and even linking it from my blog is some sort of a sin, but this page has some details about the issues, so it's a good read after all.
One way to fix the out-of-order execution problem is to issue the cpuid instruction first: it is a serializing instruction, so every pending instruction must complete before execution continues, which effectively prevents reordering around the measurement point.
This function could be used to read the Time Stamp Counter:
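static inline unsigned long long rdtsc(void)
{
    unsigned hi, lo;

    /* cpuid serializes execution before rdtsc reads the counter */
    __asm__ __volatile__ (
        "xorl %%eax, %%eax\n"
        "cpuid\n"
        "rdtsc\n"
        : "=a" (lo), "=d" (hi)
        : /* no input operands */
        : "%ebx", "%ecx");

    /* combine the two 32-bit halves into one 64-bit tick count */
    return ((unsigned long long) hi << 32) | lo;
}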
The unsigned long long value returned by reading the time register is usually measured in ticks elapsed since the last time the CPU was reset, so to get microseconds out of ticks you need to know the frequency the clock is running at (for example by reading /proc/cpuinfo).
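The conversion itself is simple arithmetic. The sketch below assumes you have already obtained the counter frequency in Hz from somewhere (the hz parameter and the name ticks_to_us are illustrative only), and that the frequency stayed constant during the measurement, which is exactly what power saving can break:

```c
#include <stdint.h>

/* Hypothetical helper: convert a tick delta to microseconds.
 * hz is the counter frequency in Hz, supplied by the caller. */
static inline double ticks_to_us(uint64_t start, uint64_t stop, double hz)
{
    /* unsigned subtraction stays correct across a single wrap-around */
    return (double) (stop - start) * 1e6 / hz;
}
```

For instance, 2000 ticks of a 2 MHz counter come out as 1000 microseconds.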
See what I mean for non portable?
So, this is a good idea on some platforms and a bad idea on others. It generally works fine if you do it correctly, but in my opinion it's quite a pain unless you are doing very specific things.
Using clock_gettime
int clock_gettime(clockid_t clk_id, struct timespec *tp);
Last but not least, the method that I think is the most accurate while maintaining a good deal of portability. clock_gettime is again a POSIX function, but you need to link against a separate library, rt (-lrt); it is not part of the standard libc linkage the way gettimeofday is, for example. Also, gcc hides the declaration if you specify the -std=c99 option, so you need to define _POSIX_C_SOURCE=199309L as well.
Once this is done, the function is declared in time.h. The resolution is in nanoseconds, although the exact precision can be queried by using another function, clock_getres. Both take the clock id as first argument; it is in fact possible to specify a different type of clock to query, although again the set of supported clocks is system dependent.
Most of the time we are probably interested in querying either the CPU time or the thread time, using CLOCK_PROCESS_CPUTIME_ID or CLOCK_THREAD_CPUTIME_ID. As far as I know, this function is free from most of the problems that affect the other methods.
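To make the clock id argument concrete, here is a small sketch that queries the resolution of a clock and takes one CPU-time sample. The helper names clock_resolution_ns and cpu_time_now are invented for the example, and whether a given clock id is supported is up to your system:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime under -std=c99 */
#endif
#include <time.h>

/* Hypothetical helper: resolution of a clock in nanoseconds, -1 on error. */
static inline long long clock_resolution_ns(clockid_t id)
{
    struct timespec res;

    if (clock_getres(id, &res) != 0)
        return -1;                /* clock id not supported here */
    return (long long) res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Hypothetical helper: sample the CPU time consumed by this process.
 * Returns 0 on success, like clock_gettime itself. */
static inline int cpu_time_now(struct timespec *out)
{
    return clock_gettime(CLOCK_PROCESS_CPUTIME_ID, out);
}
```

On older glibc versions remember to add -lrt when linking.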
The timespec struct is similar to the timeval one defined for gettimeofday:
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
One common operation, as said before, is to subtract two timespec values before printing the result. This function may be used (I derived this from the gcc manual):
static inline void timespecSubtract(struct timespec *result,
                                    struct timespec *start,
                                    struct timespec *stop)
{
    if (stop->tv_nsec < start->tv_nsec) {
        /* borrow one second so that tv_nsec stays non-negative */
        result->tv_sec = stop->tv_sec - start->tv_sec - 1;
        result->tv_nsec = 1000000000 + stop->tv_nsec - start->tv_nsec;
    } else {
        result->tv_nsec = stop->tv_nsec - start->tv_nsec;
        result->tv_sec = stop->tv_sec - start->tv_sec;
    }
}
As with gettimeofday, we store the elapsed time in seconds in the tv_sec member, and the additional nanoseconds in the tv_nsec member.
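Putting the pieces together, a measurement could look like the sketch below. It times a deliberately pointless busy loop, since any real workload would be an assumption on my side; timespec_diff and busy_loop_ns are illustrative names, and timespec_diff uses the same borrow logic as the subtraction function discussed above:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime under -std=c99 */
#endif
#include <time.h>

/* Subtract start from stop, borrowing a second when needed. */
static inline void timespec_diff(struct timespec *result,
                                 const struct timespec *start,
                                 const struct timespec *stop)
{
    result->tv_sec  = stop->tv_sec  - start->tv_sec;
    result->tv_nsec = stop->tv_nsec - start->tv_nsec;
    if (result->tv_nsec < 0) {
        result->tv_sec  -= 1;
        result->tv_nsec += 1000000000L;
    }
}

/* Hypothetical helper: CPU nanoseconds spent in a busy loop, -1 on error. */
static inline long long busy_loop_ns(unsigned long iterations)
{
    struct timespec start, stop, elapsed;
    volatile unsigned long sink = 0;   /* volatile keeps the loop alive */
    unsigned long i;

    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start) != 0)
        return -1;
    for (i = 0; i < iterations; i++)
        sink += i;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &stop) != 0)
        return -1;

    timespec_diff(&elapsed, &start, &stop);
    return (long long) elapsed.tv_sec * 1000000000LL + elapsed.tv_nsec;
}
```

Flattening the result to a single nanosecond count is just a convenience for printing and comparing; keep the struct if you expect very long intervals.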
Conclusion
Even a task as simple as measuring elapsed time is trickier than one would expect, but despite the million different ways of doing it, each with its own drawbacks, benefits and corner cases, it's definitely still possible to write portable code by abstracting this functionality. I hope I gave a good overview of some of the most interesting options. Also, the Internet has, as always, a great deal of documentation about this topic in case you want to know more on the subject.
For example, I found a good discussion about clock_gettime here, where the author gives a similar function to subtract timespec values. Wikipedia has an interesting entry about the Time Stamp Counter. The IBM Developer Works website is, as always, a great source of information, and has a good article about the Time Base Register.
As always, man is your friend. The Linux man page entries for the functions presented are a great starting point (and most of the time all you need) to understand the functions under discussion. The HP and Apple developer sites are also very good, as they sometimes highlight specific issues that can help coding with portability in mind. Finally, Stackoverflow is another website that is particularly helpful for finding answers to common (and at times not so common) programming related questions.
As of build 85 of JDK 7, bug 6634138 "Source generated in last round not compiled" has been fixed in javac.
Previously, source code generated in a round of annotation processing where RoundEnvironment.processingOver() was true was not compiled.
With the fix, source generated in the last round is compiled, but, as intended, while compiled such source still does not undergo annotation processing since processing is over.
The fix has also been applied to OpenJDK 6 build 19.
In annotation processing there are three distinct roles: the author of the annotation types, the author of the annotation processor, and the client of the annotations. The third role includes the responsibility to configure the compiler correctly, such as setting the source, target, and encoding options and setting the source and class file destination for annotation processing. The author of the annotation processor shares a related responsibility: properly returning the source version supported by the processor.
Most processors can be written against a particular source version and always return that source version, such as by including a @SupportedSourceVersion annotation on the processor class.
In principle, the annotation processing infrastructure could tailor the view of newer-than-supported language constructs to be more compatible with existing processors. Conversely, processors have the flexibility to implement their own policies when encountering objects representing newer-than-supported structures.
In brief, by extending version-specific abstract visitor classes, such as AbstractElementVisitor6 and AbstractTypeVisitor6, the visitUnknown method will be called on entities newer than the version in question.
Just as regression tests inside the JDK itself should by default follow a dual policy of accepting the default source and target settings rather than setting them explicitly like other programs, annotation processors used for testing with the JDK should generally support the latest source version and not be constrained to a particular version. This allows any issues or unexpected interactions of new features to be found more quickly and keeps the regression tests exercising the most recent code paths in the compiler.
This dual policy is now consistently implemented in the langtools regression tests as of build 85 of JDK 7 (6926699).
I’m back on Shark, after a four month hiatus. A minor milestone: it can build itself again.
With a duo of fixes in JDK 7 build 85, one by Jon (6927797) and another by me (6926703), the langtools repository has reached another milestone in testing robustness: all the tests pass with assertions (-ea) and system assertions (-esa) enabled. This adds to other useful langtools testing properties, such as being able to successfully run in the speedy same vm testing mode.
Jon's fix was just updating a test so that some code would always be run with assertions disabled while my fix corrected an actual buggy assert I included in apt. Addressing such problems helps simplify analyzing test results; if there is a failure, there is a problem!
These fixes have also been applied in the forthcoming OpenJDK 6 build 19 so it too will have the same assertive testing quality.
Or “things you would do to not use Windows”...
Here is the recipe:
1. A decent Operating System with a sane Desktop Environment (in other words Fedora 12 + Gnome because it has gvfs, which is the coolest thing in the World)
2. VPN/SSH access to a Linux host that has a share to all the Windows based toolchain.
3. Some scripts to automate compilation for the test files (in my case, all the usual make machinery for Jamaica).
4. A Windows shell that is used only to invoke the build machinery (and only that!).
5. Samba/NFS to share the sane Linux directories with the borked Windows machine.
6. FTP whatever to upload the OS9 binaries to the target device.
7. MAUI and OS9 documentation.
Preparation (for 1 person):
Just open a connection with nautilus to the remote Linux machine. In a separate terminal, ssh into the same Linux machine and link the toolchain header files directory into a directory that is local to the mounted home (ugh, that's a complex sentence). This is needed only if the toolchain location is not on the local machine (in my case it is mounted via nfs); the reason is that the gvfs share will only see the remote Linux machine but not what is mounted remotely there, so the linking trick is necessary. Create the project in NetBeans, telling it that the location of the header files is the shared folder that was made visible via gvfs. Edit code as necessary.
Now the tricky part. When you compile a huge amount of code gvfs crashes, because NetBeans polls every file that is modified and make modifies them (it sets the timestamps!), so never, ever ever, never use the same virtual desktop to edit code and to issue the make command. Just go back to coding when make finishes (it's windows, so it is supposed to be slow). Remember this again: don't focus NetBeans while windows compiles code. I think this is a weird bug in gvfs, and hopefully it will be fixed. If gvfs crashes, you need to restart and reload the project in NetBeans, which takes more time than waiting for the compilation to finish… Ah, making the customers that provide a windows-only toolchain pay twice is also a good thing to do.
End result: Jamaica runs on OS9 with MAUI :)
When making a programming tool or a virtual machine, getting the tool running perfectly stable without any crash bugs always has higher priority than gaining more speed. A crashing tool is a broken tool, so I will share some tricks that I have practised to find and fix Shark LLVM JIT CodeGen crash bugs. The main trick is to be able to generate reproducible testcases, using what you can extract from the Shark LLVM JIT CodeGen crashes, that can be reported to the LLVM developers' Bugzilla bug tracker. Here is how I do it, enjoy!
How to provoke hard to find Shark LLVM JIT bugs
Some Shark LLVM JIT bugs are hard to find because they only occur after the Shark JIT enabled JVM has been running for a long time. This is because the Shark Hotspot JVM takes advantage of the fact that a running application spends about 90% of its time running only 10% of the application's code. Hotspot profiles the running code and only JITs the most frequently used methods of the program. Hotspot uses a threshold to determine which methods to JIT: when a method has been used more than 100000 times, it is scheduled to be optimized by the JIT. JIT bugs can stay undetected if they are located in infrequently executed methods, the methods that make up the other 90% of the application code.
An easy trick to provoke JIT bugs in infrequently executed code is to lower the JIT threshold in Hotspot so that Hotspot JITs everything. The JIT threshold is controlled with the -XX:CompileThreshold option, here set to 1, combined with the -Xbatch option. -Xbatch prevents Hotspot from running the JIT in the background and makes Hotspot reproduce JIT bugs more deterministically.
Using a low JIT threshold will of course make program startup orders of magnitude slower, but it will also eventually find and hit all JIT bugs for a given application. Try passing -XX:+PrintCompilation to Hotspot as well, so that you can observe all the Java methods that Hotspot is JITting and find out which method failed to JIT if Hotspot hits a JIT crash bug.
java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
1 b java.lang.Thread:: (49 bytes)
…
10 b java.lang.String::getChars (66 bytes)
*crash*
/home/xerxes/llvm/include/llvm/CodeGen/MachineFrameInfo.h:289: int64_t
llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'
failed.
Huh.. no logfile??
Most Shark LLVM JIT CodeGen crash bugs make the JVM exit instantaneously without producing a hs_err_pid*.log file. What is useful is that the JVM output will contain an Assertion, Unreachable or Unimplemented keyword and an LLVM code line number.
So what do we do now?
Thanks to -XX:+PrintCompilation we know that the last method JITed was java.lang.String::getChars, and that it caused the assertion in the LLVM CodeGen when running the Shark JIT, so the next step is to dump the LLVM IR that Shark has generated for the method.
Extract the LLVM IR for the java method that makes the Shark JIT crash.
Ok so we got a crash and we know that it was JITing of java.lang.String::getChars that caused it.
a) shark debug build -XX:SharkPrintBitcodeOf= method:
If you have built a debuggable “Mixtech” Shark build, then Shark will contain some extra useful debug runtime options, one of the more useful being
-XX:SharkPrintBitcodeOf=java.package.name::MethodName
Use it and Shark will dump the LLVM IR bitcode to stdout just before jitting it.
b) gdb call F->dump() method:
I personally prefer dumping LLVM IR from inside the GNU gdb debugger, since this method works with release Shark builds in combination with release LLVM builds, so let's jump into the gdb debugger!
Start gdb on the java application with all the options that triggered the JIT CodeGen bug!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) run
...
Segmentation fault
$
Ick, gdb crashed, why? This is because the JVM launcher “java” first sets up the system environment and then forks off in a new process using execve(). gdb gets killed by the Linux kernel when it tries to read memory across process boundaries, so we must stop java from forking!
The easiest way to prevent java from forking is to set up the system environment before launching the application. All this can be done from inside gdb, so let's try again!
$ gdb -args java -XX:CompileThreshold=1 -Xbatch -XX:+PrintCompilation JavaApplication
(gdb) break execve
Breakpoint 1 at 0x93b8
(gdb) run
(gdb) call puts(getenv("LD_LIBRARY_PATH"))
/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm/server:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/../lib/arm
$1 = 220
Ok, now we know what the LD_LIBRARY_PATH should look like, and setting it before running the java launcher will prevent java from forking using execve. This LD_LIBRARY_PATH and execve madness is thankfully gone in JDK7!
(gdb) set env LD_LIBRARY_PATH=/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm/server:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/lib/arm:/media/disk/4mar-shark-1.8pre-b18-llvm-2.7svn.so-npplugin/jre/../lib/arm
I will do one more thing, namely set a gdb breakpoint at java_md.c:652, right after the hotspot library libjvm.so has been loaded by the java launcher.
(gdb) break java_md.c:652
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
...
Breakpoint 2, LoadJavaVM ... java_md.c:652
652 if (libjvm == NULL) {
This is a good spot to set up new gdb breakpoints inside the loaded libjvm.so that contains the Shark JIT. Finally we are able to place a breakpoint on the line where the Shark JIT failed inside LLVM.
(gdb) break MachineFrameInfo.h:289
(gdb) continue
Continuing.
...
10 b java.lang.String::getChars (66 bytes)
[Switching to Thread 0x67ed96a490 (LWP 21127)]
Breakpoint 3, … at … MachineFrameInfo.h:289
Get a backtrace and try to locate the frame where Shark calls getPointerToFunction
(gdb) bt
...
#9 0x40d4ee68 in llvm::JIT::getPointerToFunction (this=0x9e138, F=0xda6f0)
...
Switch to the getPointerToFunction stack frame
(gdb) frame 9
and finally dump the LLVM IR for the function by calling the function's own method dump()!
(gdb) call F->dump()
define internal void @"java.lang.String::getChars"([84 x i8]* %method, i32 %base_pc, [788 x i8]* %thread) {
%1 = getelementptr inbounds [788 x i8]* %thread, i32 0, i32 756 ; <i8*> [#uses=1]
%zero_stack = bitcast i8* %1 to [12 x i8]* ; <[12 x i8]*> [#uses=1]
%2 = getelementptr inbounds [12 x i8]* %zero_stack, i32 0, i32 8 ; <i8*> [#uses=1]
%stack_pointer_addr = bitcast i8* %2 to i32* ; <i32*> [#uses=1]
%3 = load i32* %stack_pointer_addr ; <i32> [#uses=1]
…
%142 = getelementptr inbounds [17 x i32]* %frame, i32 0, i32 12 ; <i32*> [#uses=1]
store i32 %31, i32* %142
call void inttoptr (i32 13839116 to void ([788 x i8]*, i32)*)([788 x i8]* %thread, i32 7)
ret void
}
Hooray! We have successfully dumped the Shark generated LLVM IR for the problematic method. Now simply copy the dump output from the terminal into a file named bug.ll and continue reading.
Check for LLVM CodeGen bugs by testing if the dumped LLVM IR bug.ll file can reproduce the bug using llc
After you have extracted the LLVM IR for the problematic method, check if you can reproduce the bug using llc.
$ llvm-as < bug.ll | llc
.syntax unified
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.file ""
llc:
/wd/buildbot/llvm-arm-linux/llvm/include/llvm/CodeGen/MachineFrameInfo.h:289:
int64_t llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"'
failed.
0 llc 0x01368414
1 llc 0x01368ccc
2 libc.so.6 0x4021cc10 __default_sa_restorer_v2 + 0
Stack dump:
0. Program arguments: /wd/r96575/Debug/bin/llc -march=arm
1. Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function
'@"java.lang.String::getChars"'
Aborted
If it crashes using llc then cheer up, because you now have a reproducible CodeGen bug and that's great! These kinds of crash bugs are on the LLVM developers' most wanted list because they can fire in any tool that uses LLVM code generation. The best way to report this kind of bug is to first generate a compact testcase that triggers it, which the LLVM developers can use to fix it. The testcase can also be added to the LLVM developers' daily regression testing to make sure this bug never hits again.
If it fails to crash with an Aborted like above, then you are probably observing a JIT CodeEmitter runtime bug. Stay tuned and look forward to my next blog post on “How to fix Shark LLVM JIT CodeEmitter bugs”!
How to generate a bugpoint-reduced-simplified.bc from the bug.ll using bugpoint for CodeGen crash bugs
LLVM ships with a clever tool called bugpoint that is designed to convert dumped blocks of LLVM IR into a compact bugpoint-reduced-simplified.bc LLVM bitcode testcase file that only contains the instructions needed to reproduce the bug.
$ bugpoint -run-llc bug.ll --tool-args -march=arm
Bugpoint works by using deductive logic to break down and remove parts of the bug.ll file, automatically narrowing down the LLVM IR lines needed to reproduce the bug. It can take some minutes, so be patient, but bugpoint will eventually stop, give you a bugpoint-reduced-simplified.bc and print some information on how to reproduce the bug.
File an LLVM bug report containing the bugpoint-reduced-simplified.bc file
An example of a Shark JIT LLVM bug that has been fixed after submitting a bugpoint-reduced-simplified.bc produced from a dumped Shark method is:
LLVM PR6478 ARM CodeGen Running pass 'Prolog/Epilog Insertion & Frame Finalization' on function '@"java.lang.String::getChars"'
I hope this post has given you some inspiration on how to get Shark LLVM JIT CodeGen crash bugs fixed!
If you want to know more about how bugpoint works and how to officially prepare LLVM bug reports, then take a peek at the LLVM documentation: http://llvm.org/docs/HowToSubmitABug.html, it's great.
Xerxes

I have from time to time been working on automatic CPU feature tuning code for Shark to make the LLVM JIT generate better code for Cortex-A8 class ARM CPUs. Using it I was able to gain some substantial speed improvements; all in all it made Shark generate 30% faster code on ARM, and the Shark JIT is now able to beat the 2000 point CaffeineMark 3.0 score! I could not resist using CaffeineMark 3.0 for some benchmarking again.
While this all looks great, with rainbows and unicorns, the patch is sadly in limbo. I have had some trouble merging the code into LLVM 2.7 trunk before the next LLVM release, since I got hit by LLVM problem report 6544, which will force me to redesign the implementation on top of LLVM before it can be committed.
A similar optimisation to what I did for ARM Linux can easily be done for Shark on PPC Linux as well, by adapting the ARM CPU feature detection code to PPC!
For those who are interested, the optimising code has been submitted and can be fetched from LLVM PR5389.
Cheers and have a great day!
Xerxes
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* nested function: a GCC extension, not standard C */
    void testMe(void)
    {
        fprintf(stderr, "b0rk3d!\n");
    }

    testMe();
    return (EXIT_SUCCESS);
}
Without optimisations, gcc creates a regular function call. With optimisation I guess it depends on how often you use it, but this simple test gets (unsurprisingly, I guess) inlined.
Very cool :)
Now we come to the first peak in The Beatles’ career: their 3rd album A Hard Day’s Night. Somewhere I’ve read that this album sounds like somebody opened a bottle of champagne, and this hits the nail on the head. Where the last album With The Beatles was slightly dark, this one is all bright, hyperactive and full of adrenaline. It starts with the weird opening chord of the title song (which puzzled even scientists, because nobody was able to find out exactly what chord it was until 2008), and runs through a set of 13 Beatles originals (yeah right, no cover versions this time). It shows all the good qualities of the early Beatles, their incredible vocal harmonies, their simple-yet-intriguing melodies and rhythms. And it shows the incredible fun and energy that the guys had.
I remember a fun little story. When I was young (much younger than now at least), I saved myself a lot of money (around 30 DM, approx. 15€ nowadays) to buy this album on CD. That must have been in 1992 or so, I was about 14. I didn’t even have a CD player! So I went to the closest supermarket, gave away my hard-earned money to get this CD. And when I was home and saw ‘Mono’ on the package, I was so disappointed, because I believed that this was a fake thing. In my childish view I assumed that the real thing must be stereo because stereo is obviously better, no? In the end, I went back to the supermarket and returned the CD (that was quite a funny and geeky argument with the salesguy, not wanting the CD because it’s mono). But not without first going to a friend and making a copy of it to cassette tape. Yay
Only much later I realized that the mono version was indeed the official version and bought the CD again. Now I have both, and comparing the versions I must say that the mono version is indeed better, again there are a bunch of glitches on stereo (the start of ‘Should Have Known Better’ for example) that have been corrected in mono. It needs to be said that at this time mono was by far the dominant format and the Beatles themselves never really cared for the stereo mixes, leaving them to others, while attending the mono mixes themselves. However, compared to the first two albums, which have this funny extreme left-right panning, the stereo mixes are a bit more sane, which owes to the fact that they used 4 track recording for the first time.
That was a bit technical this time. I usually try to avoid that because we want to listen to music, not technology, right? I enjoyed listening to this album today anyway. Quite a lot, even. Next time I will have a look (listen) at Beatles for Sale.
rkennke
We are pleased to announce the release of IcedTea6 1.7.1!
The IcedTea project provides a harness to build the source code from
OpenJDK6 using Free Software build tools. It also includes the only
Free Java plugin and Web Start implementation, and support for
additional architectures over and above x86, x86_64 and SPARC via the
Zero assembler port.
The tarball can be downloaded from:
The following people helped with the 1.7 release series:
Lillian Angel, Gary Benson, Deepak Bhole, Andrew Haley, Andrew John Hughes, Nobuhiro Iwamatsu, Matthias Klose, Martin Matejovic, Edward Nevill, Xerxes Rånby, Robert Schuster, Jon VanAlten, Mark Wielaard and Man Lung Wong.
We would also like to thank the bug reporters and testers!
To get started:
$ tar xzf icedtea6-1.7.1.tar.gz
$ cd icedtea6-1.7.1
Full build requirements and instructions are in INSTALL:
$ ./configure [--enable-visualvm --with-openjdk --enable-pulse-java --enable-systemtap ...]
$ make
Moving on from identity and equality of objects, different notions of equality are also surprisingly subtle in some numerical realms.
As comes up from time to time and is often surprising, the "==" operator defined by IEEE 754 and used by Java for comparing floating-point values (JLSv3 §15.21.1) is not an equivalence relation. Equivalence relations satisfy three properties: reflexivity (something is equivalent to itself), symmetry (if a is equivalent to b, b is equivalent to a), and transitivity (if a is equivalent to b and b is equivalent to c, then a is equivalent to c).
The IEEE 754 standard defines four possible mutually exclusive ordering relations between floating-point values:
equal
greater than
less than
unordered
A NaN (Not a Number) is unordered with respect to every floating-point value, including itself. This was done so that NaNs would not quietly slip by without due notice. Since (NaN == NaN) is false, the IEEE 754 "==" relation is not an equivalence relation since it is not reflexive.
An equivalence relation partitions a set into equivalence classes; each member of an equivalence class is "the same" as the other members of the class for the purposes of that equivalence relation. In terms of numerics, one would expect equivalent values to result in equivalent numerical results in all cases. Therefore, the size of the equivalence classes over floating-point values would be expected to be one; a number would only be equivalent to itself. However, in IEEE 754 there are two zeros, -0.0 and +0.0, and they compare as equal under ==. For IEEE 754 addition and subtraction, the sign of a zero argument can at most affect the sign of a zero result. That is, if the sum or difference is not zero, a zero of either sign doesn't change the result. If the sum or difference is zero and one of the arguments is zero, the other argument must be zero too:
-0.0 + -0.0 ⇒ -0.0
-0.0 + +0.0 ⇒ +0.0
+0.0 + -0.0 ⇒ +0.0
+0.0 + +0.0 ⇒ +0.0
Therefore, under addition and subtraction, both signed zeros are equivalent. However, they are not equivalent under division since 1.0/-0.0 ⇒ -∞ but 1.0/+0.0 ⇒ +∞ and -∞ and +∞ are not equivalent.1
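These properties are not specific to Java. The sketch below shows them in C, on the assumption that the platform follows IEEE 754 and the compiler is not run in a fast-math mode; is_reflexive and reciprocal are names made up for the illustration:

```c
#include <math.h>

/* Hypothetical helper: does x compare equal to itself under ==? */
static inline int is_reflexive(double x)
{
    return x == x;            /* false only when x is a NaN */
}

/* Hypothetical helper: 1.0/x, which exposes the sign of a zero. */
static inline double reciprocal(double x)
{
    return 1.0 / x;           /* IEEE 754: 1/-0.0 is -inf, 1/+0.0 is +inf */
}
```

So -0.0 == 0.0 holds, yet reciprocal(-0.0) and reciprocal(0.0) land on opposite infinities, and is_reflexive(NAN) is false.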
Despite the rationales for the IEEE 754 specification to not define == as an equivalence relation, there are legitimate cases where one needs a true equivalence relation over floating-point values, such as when writing test programs, and cases where one needs a total ordering, such as when sorting. In my numerical tests I use a method that returns true for two floating-point values x and y if:
((x == y) &&
(if x and y are both zero they have the same sign)) ||
(x and y are both NaN)
Conveniently, this is just computed by using (Double.compare(x, y) == 0). For sorting or a total order, the semantics of Double.compare are fine; NaN is treated as being the largest floating-point value, greater than positive infinity, and -0.0 < +0.0. That ordering is the total order used by java.util.Arrays.sort(double[]). In terms of semantics, it doesn't really matter where the NaNs are ordered with respect to other values as long as they are consistently ordered that way.2
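For readers outside Java, the predicate described above can be sketched in C roughly as follows. This is my own analogue, not a translation of any JDK source, and fp_equivalent is a hypothetical name:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true equivalence over doubles.
 * All NaNs are treated as equivalent to each other,
 * and -0.0 is distinguished from +0.0. */
static inline bool fp_equivalent(double x, double y)
{
    if (isnan(x) || isnan(y))
        return isnan(x) && isnan(y);
    if (x == 0.0 && y == 0.0)
        return !signbit(x) == !signbit(y);   /* zeros must share a sign */
    return x == y;
}
```

Unlike ==, this relation is reflexive for NaN and puts the two zeros in separate equivalence classes.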
These subtleties of floating-point comparison were also germane on the Project Coin mailing list last year; the definition of floating-point equality was discussed in relation to adding support for relational operations based on a type implementing the Comparable interface. That thread also broached the complexities involved in comparing BigDecimal values.
The BigDecimal class has a natural ordering that is inconsistent with equals; that is for at least some inputs bd1 and bd2,
c.compare(bd1, bd2)==0
has a different boolean value than
bd1.equals(bd2).3
In BigDecimal, the same numerical value can have multiple representations, such as (100 × 10^0) versus (10 × 10^1) versus (1 × 10^2). These are all "the same" numerically (compareTo == 0) but are not equals with each other. Such values are not equivalent under the operations supported by BigDecimal; for example (100 × 10^0) has a scale of 0 while (1 × 10^2) has a scale of -2.4
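As an illustrative sketch (the class name is made up), the difference between compareTo and equals, and its effect on hash-based versus sorted sets, can be seen with two representations of one hundred:

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalCohortDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("100");   // unscaled value 100, scale 0
        BigDecimal b = new BigDecimal("1E+2");  // unscaled value 1, scale -2

        System.out.println(a.compareTo(b) == 0); // true: same numerical value
        System.out.println(a.equals(b));         // false: different scales
        System.out.println(a.scale());           // 0
        System.out.println(b.scale());           // -2

        // A HashSet relies on equals/hashCode, a TreeSet on compareTo,
        // so the two collections disagree about how many elements there are.
        Set<BigDecimal> hashed = new HashSet<>();
        hashed.add(a);
        hashed.add(b);
        Set<BigDecimal> sorted = new TreeSet<>();
        sorted.add(a);
        sorted.add(b);
        System.out.println(hashed.size());       // 2
        System.out.println(sorted.size());       // 1
    }
}
```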
While subtle, the different notions of numerical equality each serve a useful purpose and knowing which notion is appropriate for a given task is an important factor in writing correct programs.
1 There are two zeros in IEEE 754 because there are two infinities. Another way to extend the real numbers to include infinity is to have a single (unsigned) projective infinity. In such a system, there is only one conceptual zero. Early x87 chips before IEEE 754 was standardized had support for both signed (affine) and projective infinities. Each style of infinity is more convenient for some kinds of computations.
2 Besides the equivalence relation offered by Double.compare(x, y), another equivalence relation can be induced by either of the bitwise conversion routines, Double.doubleToLongBits or Double.doubleToRawLongBits. The former collapses all bit patterns that encode a NaN value into a single canonical NaN bit pattern, while the latter can let through a platform-specific NaN value. Implementation freedoms allowed by the original IEEE 754 standard have allowed different processor families to define different conventions for NaN bit patterns.
3 I've at times considered whether it would be worthwhile to include an "@NaturalOrderingInconsistentWithEquals" annotation in the platform to flag the classes that have this quirk. Such an annotation could be used by various checkers to find potentially problematic uses of such classes in sets and maps.
4 Building on wording developed for the BigDecimal specification under JSR 13, when I was editor of the IEEE 754 revision, I introduced several pieces of decimal-related terminology into the draft. Those terms include preferred exponent, analogous to the preferred scale from BigDecimal, and cohort, "The set of all floating-point representations that represent a given floating-point number in a given floating-point format." Put in terms of BigDecimal, the members of a cohort would be all the BigDecimal numbers with the same numerical value, but distinct pairs of scale (negation of the exponent) and unscaled value.
During the past months I have seen some really cool stuff done using small power-efficient ARM computers and OpenJDK.

Simple Simon PT connected to a hospital laboratory system by using a power-efficient plug computer and DisplayLink USB screen. All powered by OpenJDK.
Simple Simon PT connects:
This project hooks up a battery-powered laboratory coagulation device, a Simple Simon PT reader, to a standard hospital laboratory system using ASTM-1394-1397 / LIS2-A2 connectivity over Ethernet. A small ARM-based plug computer does all data message processing and communication. User interaction is performed by using the Simon reader and a USB barcode reader to enter laboratory identification. Optionally, a USB touch screen can be connected for improved user feedback, displaying charts using JFreeChart to give a better understanding of the coagulation process.
Power consumption tops out at 15 W with the USB screen attached and 6 W without. All running silently without any moving parts!
Shark linked against dynamic LLVM .so library
Earlier today I got Shark linked against a shared libLLVM-2.7svn.so generated from LLVM 2.7svn trunk. It works by simply building LLVM using configure --enable-shared --enable-optimized --disable-assertions and then tweaking the Icedtea6 main Makefile to use the shared library during linking:
Replace the line
LLVM_LIBS = -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMMCParser -lLLVMX86AsmPrinter -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Info -lLLVMJIT -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMCore -lLLVMSupport -lLLVMSystem
with
LLVM_LIBS = -lLLVM-2.7svn
in the main icedtea6/Makefile and then build Icedtea6 normally. Shark currently builds and works right out of the box when using an LLVM release build!
A cool thing about building Shark against the shared library is that you can switch the LLVM JIT that Shark uses between running with or without assertions, debug code, and various extra optimizations by simply replacing the /usr/local/lib/libLLVM-2.7svn.so file with what you want. Linking time during Shark builds and the Shark footprint are impressively smaller as well. I'm really happy to see this functionality in LLVM 2.7!
The LLVM 2.7 code freeze before the 2.7 release happens in about 1.5 weeks from now, and I will stay busy for some days observing and polishing the current LLVM svn trunk to be usable with openjdk-6-shark.
Edward Nevill created an ARM Jazelle RTC Thumb2 JIT reference implementation
While I have been busy taming Sharks, a new kind of Thumb2 JIT has emerged, built by Edward Nevill of Cambridge Software Labs! The new Thumb2 JIT has been committed to the Icedtea6 trunk, and it is a working implementation of Jazelle RTC to be used by ARM Cortex-A8+ class CPUs. It is wonderful that this has been released as free software. Wow!
Suddenly we have three different JITs to use on ARM with OpenJDK: Cacao, Shark and T2. An opportunity has emerged to tier them, and so I did. Here comes the raw “truth” produced by Caffeine Mark 3! This will probably be the last time I show off any Caffeine Mark 3 benchmark, since it really doesn't do justice to real-world client applications, where responsiveness is more crucial than top runtime speed. Nevertheless, benchmarking using CM30 has always felt fun, so here we go: all benchmarks were run on a Sharp PC-Z1 Cortex-A8 Mobile internet tool.

Tier between Edward's Thumb2 JIT, Shark LLVM JIT and Cacao JIT: all running on an ARM Sharp PC-Z1 Mobile internet tool smartbook using OpenJDK 6 compiled with Icedtea6.
This new T2 JIT’s main strength is reduced jitting time. It basically cuts all jitting time to almost zero, and client applications on ARM finally run from tick one. This Thumb2 JIT makes for a really nice Java applet browser experience, with about 15 seconds of first applet startup time on an ARM smartbook and all applets usable instantly after being loaded.
A small 1min 12seconds .3gp movie displaying some java applets running on the Sharp PC-Z1 featuring the new thumb2jit from Icedtea6
Cheers and have a great day!
Xerxes
Today I successfully executed an invokedynamic call on SPARC for the first time. Excellent!
$ bin/jruby.gamma -J-XX:+UseSerialGC -J-Djruby.compile.invokedynamic=true -J-Xint -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableMethodHandles -J-XX:+EnableInvokeDynamic bench/bench_fib_recursive.rb
OpenJDK Server VM (17.0-b08-internal-jvmg) for solaris-sparc JRE (1.7.0), built on Feb 25 2010 04:35:47 by "ct232829" with Workshop 5.9
VM option '+UseSerialGC'
VM option '+UnlockExperimentalVMOptions'
VM option '+EnableMethodHandles'
VM option '+EnableInvokeDynamic'
52.813000 0.000000 52.813000 ( 52.296000)
52.824000 0.000000 52.824000 ( 52.823000)
51.808000 0.000000 51.808000 ( 51.808000)
49.740000 0.000000 49.740000 ( 49.740000)
49.450000 0.000000 49.450000 ( 49.450000)
MethodHandle calls have already been working for a couple of days, and I can run the JDK MethodHandlesTest without any errors:
$ gamma -Xinternalversion
OpenJDK Server VM (17.0-b08-internal-jvmg) for solaris-sparc JRE (1.7.0), built on Feb 25 2010 04:35:47 by "ct232829" with Workshop 5.9
$ gamma -Xint -XX:+UnlockExperimentalVMOptions -XX:+EnableMethodHandles -classpath /java/devtools/share/junit/latest/junit.jar:. org.junit.runner.JUnitCore MethodHandlesTest
VM option '+UnlockExperimentalVMOptions'
VM option '+EnableMethodHandles'
JUnit version 4.4
OpenJDK Server VM warning: JSR 292 invokedynamic is disabled in this JVM. Use -XX:+UnlockExperimentalVMOptions -XX:+EnableInvokeDynamic to enable.
.IIIIII.findStatic :::::::::::.findVirtual :::::::::::::::.findSpecial ::.bind ::::::::::::::::::::::.unreflect ::::::::::::::::::::::::I.unreflectGetter .unreflectSetter .arrayElementGetter .arrayElementSetter .convertArguments ::::::.permuteArguments .spreadArguments .collectArguments .insertArguments .filterArguments .foldArguments .dropArguments .exactInvoker, genericInvoker, varargsInvoker, dynamicInvoker .guardWithTest .catchException .throwException .testCastFailure
Time: 7.984
OK (23 tests)
JSR 292 SPARC support is on its way...
Let’s continue the Beatles reviews that I started with Please Please Me with their next album: With The Beatles.
I remember quite well when I listened to this album for the first time. It is one of those albums that immediately bring back the mood, smell and details of that particular situation and time in my life. I was about 11 years old (I guess) and visiting my father. He had a Czech pressing of this album, and I listened to it over his great headphones. Of course I copied it to cassette (at this point cassette and reel tape were the only sources of music for me, no thinking of CDs yet). I listened to it over and over again on my Walkman and couldn’t get enough of it. The weird hard-panned stereo mix allowed for fun experiments: if you only listen to one channel, you get voices and drums (for example) and on the other channel guitars and bass, etc. Quite fun. I was a bit disappointed when I bought the CD later that it was only available in mono – now I understand that the mono version is in fact much better overall. Like with the first album (and the next albums as well), the stereo mixes have some glitches that were corrected in mono (listen to the start of ‘Roll Over Beethoven’ as an example). The album blasts off with ‘It Won’t Be Long’ (doing the Yeah shouting that they became famous for with their single She Loves You – which did not end up on the album) and keeps the energy until the last note of ‘Money’. Compared to their first album it is slightly more complex; they added a piano (and a rumbling piano at that) and refined their sound. Where the first album was sort of a generic beat music thing, this album is definitely Beatles (in the full swing of Beatlemania). They also deliver the same basic ingredients that were present on the first album: love songs, rockers, exotic stuff, cover versions, some George and some Ringo.
My favorite is probably the last song ‘Money (That’s What I Want)’ (money seems to be a recurring theme in their career, as if they had too little: ‘Can’t Buy Me Love’, ‘Taxman’, ‘Baby You’re a Rich Man’ and finally ‘You Never Give Me Your Money’ and probably some others that I forgot). The cover of the album is also notable: instead of the happy-go-lucky photographs that were to be expected, it shows an arty black and white photo with the four in black polo neck pullovers. To me this is one of my favorite Beatles album covers. I really enjoyed listening to this album today; it’s a good album for foggy days, but apparently it also works for sunny spring days like today.
Please let me know how you like the album in the comments below, and stay tuned for the next album, A Hard Day’s Night.
rkennke
When designing types to be reused by others, there are reasons to favor interfaces over abstract classes.
One complication of using an interface-based approach stems from defining reasonable behavior for the equals and hashCode methods, especially if different implementations are intended to play well together when used in data structures like collections, in particular if an interface type is meant to serve as the key of a map or as the element type of a set.
Some interfaces, like CharSequence, are designed to not be a usable type for a map key or an element type of a set:
[The CharSequence] interface does not refine the general contracts of the equals and hashCode methods. The result of comparing two objects that implement CharSequence is therefore, in general, undefined. Each object may be implemented by a different class, and there is no guarantee that each class will be capable of testing its instances for equality with those of the other. It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set or as keys in a map.
Amongst other problems, CharSequences are not required to be immutable so in general there are always hazards from time of check to time of use conditions.
Even if a type is not suitable as a map key, it can be fine as the type of the value to which a key gets mapped. Likewise, even if a type cannot serve as the element type of a set, it can often still be perfectly fine as the element type of a list.
Expanding on a slide from my JavaOne talk
Tips and Tricks for Using Language Features in API Design and Implementation, for interface types intended to be used as map keys or set elements, equality can be defined in several ways.
First, equality can be defined solely in terms of information retrievable from methods of the interface. Alternatively, equality can be defined in terms of information retrievable via the interface methods as well as additional information. Finally, object identity (the == relation) is always a valid definition for equals and often a good implementation choice.
An example of the first kind of equality definition is specified for annotation types:
java.lang.annotation.Annotation.equals(Object):
Returns true if the specified object represents an annotation that is logically equivalent to this one. In other words, returns true if the specified object is an instance of the same annotation type as this instance, all of whose members are equal to the corresponding member of this annotation, as defined below: ...
java.lang.annotation.Annotation.hashCode():
Returns the hash code of this annotation, as defined below:
The hash code of an annotation is the sum of the hash codes of its members (including those with default values), as defined below:...
A consequence of defining equality in this manner is that the hashCode algorithm must also be specified. If it were not specified, the equals/hashCode contract would be violated since equal objects must have equal hashCodes. Therefore, different implementations of this style of interface must have enough information to implement the equals method and have a precise algorithm for hashCode.
An annotation type is a kind of interface. At runtime, dynamic proxies are used to create the core reflection objects implementing annotation types, such as the objects returned by the
getAnnotation method. After a quick identity check, the equals algorithm used in the proxy sees if the annotation type of the two annotation objects is the same and then compares the results of the annotation type's methods. This indirection allows the annotation objects from core reflection to interact properly with other implementations of annotation objects. The annotation objects generated for annotation processing in apt and javac both use the same underlying implementation as core reflection. However, completely independent annotation implementations are fine too. For example, the code below
import javax.annotation.processing.*;
import javax.lang.model.SourceVersion;
import java.lang.annotation.*;
import java.lang.reflect.*;
import java.util.*;

/**
 * Demonstrate equality of different annotation implementations.
 */
@SupportedSourceVersion(SourceVersion.RELEASE_6)
public class AnnotationEqualityDemonstration {
    // Independent, hand-written implementation of the annotation type.
    static class MySupportedSourceVersion implements SupportedSourceVersion {
        private final SourceVersion sourceVersion;

        private MySupportedSourceVersion(SourceVersion sourceVersion) {
            this.sourceVersion = sourceVersion;
        }

        public Class<? extends Annotation> annotationType() {
            return SupportedSourceVersion.class;
        }

        public SourceVersion value() {
            return sourceVersion;
        }

        // Equality is defined only in terms of the interface's methods.
        public boolean equals(Object o) {
            if (o instanceof SupportedSourceVersion) {
                SupportedSourceVersion ssv = (SupportedSourceVersion) o;
                return ssv.value() == sourceVersion;
            }
            return false;
        }

        // Follows the hashCode algorithm specified by Annotation.hashCode().
        public int hashCode() {
            return (127 * "value".hashCode()) ^ sourceVersion.hashCode();
        }
    }

    public static void main(String... args) {
        SupportedSourceVersion reflectSSV =
            AnnotationEqualityDemonstration.class.getAnnotation(SupportedSourceVersion.class);
        SupportedSourceVersion localSSV =
            new MySupportedSourceVersion(reflectSSV.value());

        System.out.println("reflectSSV == localSSV is " +
                           (reflectSSV == localSSV));
        System.out.println("reflectSSV.equals(localSSV) is " +
                           reflectSSV.equals(localSSV));
        System.out.println("localSSV.equals(reflectSSV) is " +
                           localSSV.equals(reflectSSV));
        System.out.println("reflectSSV.getClass()equals(localSSV.getClass()) is " +
                           reflectSSV.getClass().equals(localSSV.getClass()));
        System.out.println("\nreflectSSV.hashCode() is " +
                           reflectSSV.hashCode());
        System.out.println("localSSV.hashCode() is " +
                           localSSV.hashCode());
    }
}
when run outputs:
reflectSSV == localSSV is false
reflectSSV.equals(localSSV) is true
localSSV.equals(reflectSSV) is true
reflectSSV.getClass()equals(localSSV.getClass()) is false

reflectSSV.hashCode() is 1867635603
localSSV.hashCode() is 1867635603
The second kind of equality definition is specified for the language modeling interfaces in the javax.lang.model.element package:
javax.lang.model.element.Element.equals(Object):
Note that the identity of an element involves implicit state not directly accessible from the element's methods, including state about the presence of unrelated types. Element objects created by different implementations of these interfaces should not be expected to be equal even if "the same" element is being modeled; this is analogous to the inequality of Class objects for the same class file loaded through different class loaders.
Inside javac, instance control is used for the implementation classes for javax.lang.model.element.Element subtypes. This allows the default pointer equality to be used and allows the hashing algorithm to not be specified. Just as you can't step in the same river twice, the identity of an Element object is tied to the context in which it is created. Operationally, one consequence of this context sensitivity is that Element objects modeling "the same" type produced during different rounds of annotation processing will not be equal even if there are equivalent methods, fields, constructors, etc. in both types in both rounds.
When independent implementations of an interface are not required to be equal to one another, the hashCode algorithm does not need to be specified, providing the implementer more flexibility. This second style of specification allows disjoint islands of implementations to be defined.
Which style of specification is more appropriate depends on how the interface type is intended to be used. Defining interoperable implementations is more difficult and limits the ability of the interface to be retrofitted onto existing types. For example, while the Element interface and other interfaces from JSR 269 were successfully implemented by classes in both javac and Eclipse, it would be impractical to expect Element objects from those disparate implementations to compare as equal.
Mixin interfaces, like CharSequence and Closeable, should be cautious in defining equals behavior if the interface is intended to be widely implemented. In some cases, a mixin interface can finesse this issue by being limited to an existing type hierarchy with already defined equals and hashCode policies. For example, the Parameterizable and QualifiedNameable interfaces added to the javax.lang.model.element package in JDK 7 (6460529) are extensions to javax.lang.model.element.Element and therefore get to reuse the existing policies quoted above.
At the end of last year, EMI/Apple re-released the whole Beatles catalogue as a series (or box sets) of remastered CDs. The Beatles were more or less my first encounter with pop music when I was 8 or so, my first self-bought CD at the age of 12 was a Beatles CD and of course, I had to go out and buy these excellent remasters. I got both the mono and stereo editions. I have now spent many weeks listening to these CDs, and re-discovering some of the stuff was quite fun, so I want to share my impressions in a series of short reviews intermingled with some personal anecdotes. Let’s do it in order of the release of the original LPs and start with ‘Please Please Me’.
I think this is quite a good career starter. Innocent, powerful, inspired, raw, rocking. It has most of the ingredients of their later Beatlemania success albums, i.e. love songs (P.S. I Love You and most others), rockers (I Saw Her Standing There,…), some cover versions, a slightly more exotic song (A Taste of Honey), some Ringo (Boys), some George (Do You Want To Know a Secret), a weird sense of humor (the Beatles covering a girl group talking about ‘Boys’). I always really enjoy this album. It was recorded in one day, with John having a sore throat, and it shows – positively. From the first through the last song you can literally feel the energy of those 4 relatively unknown guys, and sense that Beatlemania is about to start. It is what the Beatles were at this point: a hard-working live band. The last song is interesting: a kamikaze version of Twist and Shout, with John shredding his vocal cords. Hilarious. It could have been a total failure (with no second chance to record it), but instead it became music history. The stereo version is mostly relevant for historical reasons, the sound being a bit thin and the mixes having several glitches which were corrected in the mono mixes. These mono mixes on the other hand are just great and solid rocking. All in all the album always makes for a happy-go-lucky listen on sunny afternoons, even if it is not one of the totally essential ones. The album cover is also interesting to note: it shows the four guys on the stairs in the EMI building. Several years later they did the exact same scene again, now with long hair, for the planned last album (Get Back, which would become Let It Be). Both pictures would later be used for the red and blue best-of double albums.
Next one up will be With The Beatles. Stay tuned. Please tell me what you think about the album in the comments.
rkennke
Catching up on writing about more numerical work from years past, the second article in a two-part series finished last year discusses some low-level floating-point manipulation methods I added to the platform over the course of JDKs 5 and 6. Previously, I published a blog entry reacting to the first part of the series.
JDK 6 enjoyed several numerics-related library changes. Constants for MIN_NORMAL, MIN_EXPONENT, and MAX_EXPONENT were added to the Float and Double classes. I also added to the Math and StrictMath classes the following methods for low-level manipulation of floating-point values:
public static double copySign(double magnitude, double sign)
public static int getExponent(double d)
public static double nextAfter(double start, double direction)
public static double nextUp(double d)
public static double scalb(double d, int scaleFactor)
There are also overloaded methods for float arguments.
In terms of the IEEE 754 standard from 1985, the methods above provide the core functionality of the recommended functions. In terms of the 2008 revision to IEEE 754, analogous functions are integrated throughout different sections of the document.
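To make the listed signatures concrete, here is a brief sketch exercising each method once. The class name is invented for illustration; the methods themselves are the java.lang.Math ones listed above.

```java
class RecommendedFunctionsDemo {
    public static void main(String[] args) {
        // copySign: magnitude of the first argument, sign of the second
        // (the sign of a zero counts).
        System.out.println(Math.copySign(3.0, -0.0));                 // -3.0

        // getExponent: unbiased exponent field; 8.0 is 1.0 x 2^3.
        System.out.println(Math.getExponent(8.0));                    // 3

        // nextUp: the adjacent floating-point value toward positive infinity.
        System.out.println(Math.nextUp(1.0) == 1.0 + Math.ulp(1.0));  // true

        // nextAfter: the adjacent value in the direction of the second argument.
        System.out.println(Math.nextAfter(1.0, 0.0) < 1.0);           // true

        // scalb: d x 2^scaleFactor, here 1.5 x 2^4.
        System.out.println(Math.scalb(1.5, 4));                       // 24.0

        // The float overloads behave analogously.
        System.out.println(Math.getExponent(8.0f));                   // 3
    }
}
```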
While a student at Berkeley, I wrote a tech report on algorithms I developed for an earlier implementation of these methods, an implementation written many years ago when I was a summer intern at Sun. The implementation of the recommended functions in the JDK is a refinement of the earlier work, a refinement that simplified code, added extensive and effective unit tests, and sported better performance in some cases. In part the simplifications came from not attempting to accommodate IEEE 754 features not natively supported in the Java platform, in particular rounding modes and sticky flags.
The primary purpose of these methods is to assist in the development of math libraries in Java, such as the recent pure Java implementation of floor and ceil (6908131).
This expected use-case drove certain API differences with the functions sketched by IEEE 754. For example, the getExponent method simply returns the unbiased value stored in the exponent field of a floating-point value rather than doing additional processing, such as computing the exponent needed to normalize a subnormal number, additional processing called for in some flavors of the 754 logb operation. Such additional functionality can actually slow down math libraries since libraries may not benefit from the additional filtering and may actually have to undo it.
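The difference is visible on subnormal values. In the sketch below, normalizedExponent is a hypothetical helper, not a JDK method, showing one possible way to layer logb-style normalization on top of getExponent; it assumes a finite, nonzero argument.

```java
class ExponentDemo {
    /**
     * Hypothetical logb-style helper: the exponent d would have if it were
     * normalized. Assumes d is finite and nonzero.
     */
    static int normalizedExponent(double d) {
        int e = Math.getExponent(d);
        if (e >= Double.MIN_EXPONENT) {
            return e; // already a normal number
        }
        // Subnormal: scale up by 2^54 (exact), read the exponent, undo the scaling.
        return Math.getExponent(d * 0x1.0p54) - 54;
    }

    public static void main(String[] args) {
        System.out.println(Math.getExponent(1.0));               // 0
        System.out.println(Math.getExponent(Double.MIN_NORMAL)); // -1022
        // Subnormals and zero report the raw field value, MIN_EXPONENT - 1.
        System.out.println(Math.getExponent(Double.MIN_VALUE));  // -1023
        System.out.println(Math.getExponent(0.0));               // -1023
        // The extra processing recovers the mathematical exponent of 2^-1074.
        System.out.println(normalizedExponent(Double.MIN_VALUE)); // -1074
    }
}
```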
The Math and StrictMath specifications of copySign have a small difference: the StrictMath version always treats NaNs as having a positive sign (a sign bit of zero) while the Math version does not impose this requirement.
The IEEE standard does not ascribe a meaning to the sign bit of a NaN, and different processors have different conventions for NaN representations and how they propagate. However, if the source argument is not a NaN, the two copySign methods will produce equivalent results.
Therefore, even if being used in a library where the results need to be completely predictable, the faster Math version of copySign can be used as long as the source argument is known to be numerical.
The recommended functions can also be used to solve a little floating-point puzzle: generating the interesting limit values of a floating-point format just starting with constants for 0.0 and 1.0 in that format:
NaN is 0.0/0.0.
POSITIVE_INFINITY is 1.0/0.0.
MAX_VALUE is nextAfter(POSITIVE_INFINITY, 0.0).
MIN_VALUE is nextUp(0.0).
MIN_NORMAL is MIN_VALUE/(nextUp(1.0)-1.0).
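The recipe above can be checked against the constants in the Double class. This is a small sketch with made-up class and method names; only the two starting constants and the recommended functions are used to build each value.

```java
class LimitValuePuzzle {
    static final double ZERO = 0.0;
    static final double ONE = 1.0;

    static double nan()              { return ZERO / ZERO; }
    static double positiveInfinity() { return ONE / ZERO; }
    static double maxValue()         { return Math.nextAfter(positiveInfinity(), ZERO); }
    static double minValue()         { return Math.nextUp(ZERO); }
    // nextUp(1.0) - 1.0 is 2^-52, so this is 2^-1074 / 2^-52 = 2^-1022.
    static double minNormal()        { return minValue() / (Math.nextUp(ONE) - ONE); }

    public static void main(String[] args) {
        System.out.println(Double.isNaN(nan()));                            // true
        System.out.println(positiveInfinity() == Double.POSITIVE_INFINITY); // true
        System.out.println(maxValue() == Double.MAX_VALUE);                 // true
        System.out.println(minValue() == Double.MIN_VALUE);                 // true
        System.out.println(minNormal() == Double.MIN_NORMAL);               // true
    }
}
```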
I fixed a major bug in the automagic .NET serialization support.
Changes:
Binaries available here: ikvmbin-0.42.0.5.zip
Sources: ikvm-0.42.0.5.zip, openjdk6-b16-stripped.zip
Joe Darcy released OpenJDK6 b18 yesterday evening and this is now integrated with the latest IcedTea6 from Mercurial.
To build:
$ hg clone http://icedtea.classpath.org/hg/icedtea6
$ mkdir icedtea6-build
$ cd icedtea6-build
$ ../icedtea6/configure
$ make
Then grab a beverage of your choice and wait a while… I recommend adding --with-parallel-jobs to the configure invocation if you have a multi-core machine.
Once built, you will have a build of OpenJDK6 which can use the Nimbus Look and Feel as explained in my previous blog.
Running with the usual jtreg flags, -a -ignore:quiet in all repositories and adding -s for langtools, the basic regression test results on Linux for OpenJDK 6 build 18 are:
HotSpot, 24 tests passed.
Langtools, 1,355 tests passed.
JDK, 3,148 tests pass, 19 fail, and 2 have errors.
All the HotSpot tests continue to pass:
0: b17-hotspot/summary.txt pass: 24
1: b18-hotspot/summary.txt pass: 24
No differences
In langtools all the tests continue to pass and one test was added:
0: b17-langtools/summary.txt pass: 1,354
1: b18-langtools/summary.txt pass: 1,355
0     1     Test
---   pass  tools/javac/T6855236.java
1 differences
And in jdk, a few dozen new tests were added in b18 and the existing tests have generally consistent results, with a number of long-standing test failures corrected by Pavel Tisnovsky. The test run below was executed outside of Sun's and Oracle's wide-area network using the following contents for the testing network configuration file:
host=icedtea.classpath.org
refusing_host=ns1.gnu.org
far_host=developer.classpath.org
The file location to use for the networking configuration can be set by passing a -e JTREG_TESTENV=Path to file option to jtreg.
0: b17-jdk/summary.txt pass: 3,118; fail: 26
1: b18-jdk/summary.txt pass: 3,148; fail: 19; error: 2
0     1     Test
---   pass  com/sun/java/swing/plaf/nimbus/Test6741426.java
---   pass  com/sun/java/swing/plaf/nimbus/Test6849805.java
---   pass  com/sun/jdi/BreakpointWithFullGC.sh
---   pass  com/sun/jdi/ResumeOneThreadTest.java
---   pass  com/sun/jdi/SimulResumerTest.java
---   pass  demo/jvmti/compiledMethodLoad/CompiledMethodLoadTest.java
fail  pass  java/awt/Frame/DynamicLayout/DynamicLayout.java
fail  pass  java/awt/Frame/MaximizedToIconified/MaximizedToIconified.java
fail  pass  java/awt/Frame/ShownOffScreenOnWin98/ShownOffScreenOnWin98Test.java
fail  pass  java/awt/Frame/UnfocusableMaximizedFrameResizablity/UnfocusableMaximizedFrameResizablity.java
---   pass  java/awt/GraphicsDevice/CloneConfigsTest.java
fail  pass  java/awt/GridLayout/LayoutExtraGaps/LayoutExtraGaps.java
fail  pass  java/awt/Insets/CombinedTestApp1.java
fail  pass  java/awt/KeyboardFocusmanager/TypeAhead/ButtonActionKeyTest/ButtonActionKeyTest.html
fail  pass  java/awt/KeyboardFocusmanager/TypeAhead/MenuItemActivatedTest/MenuItemActivatedTest.html
fail  pass  java/awt/KeyboardFocusmanager/TypeAhead/SubMenuShowTest/SubMenuShowTest.html
fail  pass  java/awt/KeyboardFocusmanager/TypeAhead/TestDialogTypeAhead.html
pass  fail  java/awt/Multiscreen/LocationRelativeToTest/LocationRelativeToTest.java
fail  pass  java/awt/TextArea/UsingWithMouse/SelectionAutoscrollTest.html
fail  pass  java/awt/Toolkit/ScreenInsetsTest/ScreenInsetsTest.java
pass  ---   java/awt/Window/AlwaysOnTop/AlwaysOnTopEvenOfWindow.java
pass  fail  java/awt/Window/GrabSequence/GrabSequence.java
fail  pass  java/awt/grab/EmbeddedFrameTest1/EmbeddedFrameTest1.java
pass  fail  java/awt/print/PrinterJob/ExceptionTest.java
---   pass  java/lang/ClassLoader/UninitializedParent.java
pass  fail  java/net/ipv6tests/TcpTest.java
pass  fail  java/nio/channels/SocketChannel/AdaptSocket.java
pass  fail  java/nio/channels/SocketChannel/LocalAddress.java
pass  fail  java/nio/channels/SocketChannel/Shutdown.java
pass  fail  java/rmi/transport/pinLastArguments/PinLastArguments.java
---   pass  java/util/TimeZone/OldIDMappingTest.sh
---   pass  java/util/TimeZone/TimeZoneDatePermissionCheck.sh
---   pass  javax/swing/JButton/6604281/bug6604281.java
fail  pass  javax/swing/JTextArea/Test6593649.java
---   pass  javax/swing/Security/6657138/ComponentTest.java
---   pass  javax/swing/Security/6657138/bug6657138.java
---   pass  javax/swing/ToolTipManager/Test6657026.java
---   pass  javax/swing/UIManager/Test6657026.java
---   pass  javax/swing/plaf/basic/BasicSplitPaneUI/Test6657026.java
---   pass  javax/swing/plaf/metal/MetalBorders/Test6657026.java
---   pass  javax/swing/plaf/metal/MetalBumps/Test6657026.java
---   pass  javax/swing/plaf/metal/MetalInternalFrameUI/Test6657026.java
---   pass  javax/swing/plaf/metal/MetalSliderUI/Test6657026.java
pass  error sun/java2d/OpenGL/GradientPaints.java
pass  fail  sun/rmi/transport/proxy/EagerHttpFallback.java
---   pass  sun/security/provider/certpath/DisabledAlgorithms/CPBuilder.java
---   pass  sun/security/provider/certpath/DisabledAlgorithms/CPValidatorEndEntity.java
---   pass  sun/security/provider/certpath/DisabledAlgorithms/CPValidatorIntermediate.java
---   pass  sun/security/provider/certpath/DisabledAlgorithms/CPValidatorTrustAnchor.java
pass  error sun/security/ssl/javax/net/ssl/NewAPIs/SessionTimeOutTests.java
---   pass  sun/security/tools/jarsigner/emptymanifest.sh
---   pass  sun/security/util/DerValue/BadValue.java
fail  pass  sun/tools/jhat/HatHeapDump1Test.java
fail  pass  sun/tools/native2ascii/NativeErrors.java
54 differences
On February 16, 2010 the source bundle for OpenJDK 6 b18 was published.
Major changes in this build include the latest round of security fixes and, courtesy of Andrew John Hughes, a backport of the Nimbus look and feel from JDK 7. In addition, a new delivery model is being used for the jaxp and jax-ws components.
A detailed list of all the changes is also available.
There's a lot of hype around this title. I had a look at the trailers and had the feeling that the gameplay was very similar to that of very old games (1983!) such as Dragon's Lair or Space Quest.
Today I finally played the demo and I still have contrasting feelings… Is it cool? Yeah… I think it is, really: it's like playing a film, full of dark mood, beautiful scenes and colours, and a really intriguing story.
The demo shows some funny behaviour: for example, when you are not triggering a cut scene (and 90% of the time you are in a cut scene), the non-playing characters (NPCs) look a bit dumb. In fact, you have the feeling that everything is a bit fake, because the interaction takes place only when it is supposed to, and when you have no more options left you find yourself wandering around the area without knowing what to do, with the NPCs staring at a wall with an idiotic expression :)
There is a nice system that tries to address this issue and adds even more of a nice movie mood to the game: you can listen to your character's thoughts, and they act like an offscreen narrator, very cool I have to say.
The gameplay is almost non-existent, though: you press a button or two when some HUD pops up and inevitably mess with the Sixaxis controller… On the other hand, it is really beautiful to watch. I suggest waiting for a full review on Gamespot, but the game is cool so far.
I confirm that it is the same kind of game as Dragon's Lair and Space Ace; be warned if you don't like this kind of stuff.
A master programmer passed a novice programmer one day. The master noted the novice's preoccupation with a hand-held computer game. "Excuse me", he said, "may I examine it?" The novice bolted to attention and handed the device to the master. "I see that the device claims to have three levels of play: Easy, Medium, and Hard", said the master. "Yet every such device has another level of play, where the device seeks not to conquer the human, nor to be conquered by the human." "Pray, great master," implored the novice, "how does one find this mysterious setting?" The master dropped the device to the ground and crushed it under foot. And suddenly the novice was enlightened. -- Geoffrey James, "The Tao of Programming"
Writing up a piece of old work for some more Friday fun, an example of testing where the failures are likely to be led to my independent discovery of a bug in the FDLIBM pow function, one of only two bugs fixed in FDLIBM 5.3.
Even back when this bug was fixed for Java some time ago (5033578), the FDLIBM library was well-established, widely used in the Java platform and elsewhere, and already thoroughly tested, so I was quite proud my tests found a new problem. The next most recent change to the pow implementation was eleven years prior to the fix in 5.3.
The specification for Math.pow is involved, with over two dozen special cases listed. When setting out to write tests for this method, I re-expressed the specification in a tabular form to understand what was going on. After a few iterations reminiscent of tweaking a Karnaugh map, the table below was the result.
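| x \ y | –∞ | –∞ < y < –1 | –1 | –1 < y < 0 | –0.0 | +0.0 | 0 < y < 1 | 1 | 1 < y < +∞ | +∞ | NaN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| –∞ | +0.0 | f2(y) | f2(y) | f2(y) | 1.0 | 1.0 | f1(y) | f1(y) | f1(y) | +∞ | NaN |
| –∞ < x < –1 | +0.0 | f3(x, y) | f3(x, y) | f3(x, y) | 1.0 | 1.0 | f3(x, y) | f3(x, y) | f3(x, y) | +∞ | NaN |
| –1 | NaN† | f3(x, y) | f3(x, y) | f3(x, y) | 1.0 | 1.0 | f3(x, y) | f3(x, y) | f3(x, y) | NaN† | NaN |
| –1 < x < 0 | +∞ | f3(x, y) | f3(x, y) | f3(x, y) | 1.0 | 1.0 | f3(x, y) | f3(x, y) | f3(x, y) | +0.0 | NaN |
| –0.0 | +∞ | f1(y) | f1(y) | f1(y) | 1.0 | 1.0 | f2(y) | f2(y) | f2(y) | +0.0 | NaN |
| +0.0 | +∞ | +∞ | +∞ | +∞ | 1.0 | 1.0 | +0.0 | +0.0 | +0.0 | +0.0 | NaN |
| 0 < x < 1 | +∞ | x^y | x^y | x^y | 1.0 | 1.0 | x^y | x | x^y | +0.0 | NaN |
| 1 | NaN† | x^y | x^y | x^y | 1.0 | 1.0 | x^y | 1.0 | x^y | NaN† | NaN |
| 1 < x < +∞ | +0.0 | x^y | x^y | x^y | 1.0 | 1.0 | x^y | x | x^y | +∞ | NaN |
| +∞ | +0.0 | +0.0 | +0.0 | +0.0 | 1.0 | 1.0 | +∞ | +∞ | +∞ | +∞ | NaN |
| NaN | NaN | NaN | NaN | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN |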
f1(y) = isOddInt(y) ? –∞ : +∞;
f2(y) = isOddInt(y) ? –0.0 : +0.0;
f3(x, y) = isEvenInt(y) ? |x|^y : (isOddInt(y) ? –|x|^y : NaN);
† Defined to be +1.0 in C99, see §F.9.4.4 of the C99 specification. Large magnitude finite floating-point numbers are all even integers (since the precision of a typical floating-point format is much less than its exponent range, a large number will be an integer times the base raised to a power). Therefore, by the reasoning of the C99 committee, pow(-1.0, ∞) was like pow(-1.0, Unknown large even integer) so the result was defined to be 1.0 instead of NaN.
The range of arguments in each row and column is partitioned into eleven categories: ten categories of numeric values (including the two infinities) together with NaN (Not a Number). Some combinations of x and y arguments are covered by multiple clauses of the specification.
A few helper functions are defined to simplify the presentation. As noted in the table, a cross-platform wrinkle is that the C99 specification, which came out after Java was first released, defined certain special cases differently than in FDLIBM and Java's Math.pow.
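The helper functions in the table can be written out directly. What follows is only a minimal sketch: the class name `PowTableHelpers`, the `same` comparison helper, and the particular way `isOddInt` and `isEvenInt` are implemented are my own assumptions for illustration, not code taken from the regression test. The `main` method spot-checks a handful of table entries against `Math.pow`.

```java
final class PowTableHelpers {
    // True when y is a finite, integer-valued double that is odd.
    static boolean isOddInt(double y) {
        return Double.isFinite(y) && y == Math.rint(y) && Math.abs(y % 2.0) == 1.0;
    }

    // True when y is a finite, integer-valued double that is even.
    static boolean isEvenInt(double y) {
        return Double.isFinite(y) && y == Math.rint(y) && y % 2.0 == 0.0;
    }

    static double f1(double y) {
        return isOddInt(y) ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
    }

    static double f2(double y) {
        return isOddInt(y) ? -0.0 : +0.0;
    }

    static double f3(double x, double y) {
        double magnitude = Math.pow(Math.abs(x), y);
        return isEvenInt(y) ? magnitude : (isOddInt(y) ? -magnitude : Double.NaN);
    }

    // Double.compare distinguishes -0.0 from +0.0 and treats NaN as equal to NaN,
    // which is what a table-driven check of special cases needs.
    static boolean same(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        double negInf = Double.NEGATIVE_INFINITY;
        System.out.println(same(Math.pow(negInf, 3.0), f1(3.0)));       // row -inf, positive y
        System.out.println(same(Math.pow(negInf, -3.0), f2(-3.0)));     // row -inf, negative y
        System.out.println(same(Math.pow(-0.0, -3.0), f1(-3.0)));       // row -0.0, negative y
        System.out.println(same(Math.pow(-0.0, 2.0), f2(2.0)));         // row -0.0, positive y
        System.out.println(same(Math.pow(-2.0, 3.0), f3(-2.0, 3.0)));   // odd integer exponent
        System.out.println(same(Math.pow(-2.0, 0.5), f3(-2.0, 0.5)));   // non-integer exponent
        System.out.println(Double.isNaN(Math.pow(-1.0, Double.POSITIVE_INFINITY))); // Java's NaN entry
    }
}
```

Comparing with `Double.compare` rather than `==` matters here, because `==` would treat a wrongly signed zero as a match.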
A regression test based on this tabular representation of pow special cases is
jdk/test/java/lang/Math/PowTests.java. The test makes sure each interesting combination in the table is probed at least once. Some combinations receive multiple probes.
When an entry represents a range, the exact endpoints of the range are tested; in addition, other interesting interior points are tested too. For example, for the range 1 < x < +∞ the individual points tested are:
    +1.0000000000000002, // nextAfter(+1.0, +oo)
    +1.0000000000000004,
    +2.0,
    +Math.E,
    +3.0,
    +Math.PI,
    -(double)Integer.MIN_VALUE - 1.0,
    -(double)Integer.MIN_VALUE,
    -(double)Integer.MIN_VALUE + 1.0,
    (double)Integer.MAX_VALUE + 4.0,
    (double) ((1L<<53)-1L),
    (double) ((1L<<53)),
    (double) ((1L<<53)+2L),
    -(double)Long.MIN_VALUE,
    Double.MAX_VALUE,
Besides the endpoints, the interesting interior points include points worth checking because of transitions either in the IEEE 754 double format or a 2's complement integer format.
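To make those transitions concrete, here is a small sketch of what happens around a few of the listed points. The class name `BoundaryPoints` and the `isOdd` helper are assumptions made for this illustration; the calls themselves are ordinary `java.lang.Math` methods.

```java
final class BoundaryPoints {
    // True when d holds an odd integer value.
    static boolean isOdd(double d) {
        return Math.abs(d % 2.0) == 1.0;
    }

    public static void main(String[] args) {
        double twoTo53 = (double) (1L << 53);

        // The first double above 1.0, the first interior point in the list.
        System.out.println(Math.nextUp(1.0));                        // 1.0000000000000002

        // 2^53 - 1 is still exactly representable and is odd.
        System.out.println(isOdd(twoTo53 - 1.0));                    // true

        // 2^53 + 1 is not representable; the conversion rounds it back to 2^53.
        System.out.println((double) ((1L << 53) + 1L) == twoTo53);   // true

        // From 2^53 upward adjacent doubles are at least 2 apart, so every value is even.
        System.out.println(Math.ulp(twoTo53));                       // 2.0
        System.out.println(isOdd(Math.nextUp(twoTo53)));             // false

        // -(double)Integer.MIN_VALUE is 2^31, one past the largest int.
        System.out.println(-(double) Integer.MIN_VALUE > Integer.MAX_VALUE); // true
    }
}
```

The 2^53 neighbourhood is where an "is y an odd integer?" decision changes character, which is why probes on both sides of it are worth having.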
Inputs that used to fail under this testing cover a range of severities, from the almost always numerically benign error of returning a wrongly signed zero, to returning a zero when the result should be a finite nonzero value, to returning infinity for a finite result, to even returning a wrongly signed infinity!
Selected Failing Inputs
Failure for StrictMath.pow(double, double): For inputs -0.5 (-0x1.0p-1) and 9.007199254740991E15 (0x1.fffffffffffffp52) expected -0.0 (-0x0.0p0) got 0.0 (0x0.0p0).
Failure for StrictMath.pow(double, double): For inputs -0.9999999999999999 (-0x1.fffffffffffffp-1) and 9.007199254740991E15 (0x1.fffffffffffffp52) expected -0.36787944117144233 (-0x1.78b56362cef38p-2) got -0.0 (-0x0.0p0).
Failure for StrictMath.pow(double, double): For inputs -1.0000000000000004 (-0x1.0000000000002p0) and 9.007199254740994E15 (0x1.0000000000001p53) expected 54.598150033144236 (0x1.b4c902e273a58p5) got 0.0 (0x0.0p0).
Failure for StrictMath.pow(double, double): For inputs -0.9999999999999998 (-0x1.ffffffffffffep-1) and 9.007199254740992E15 (0x1.0p53) expected 0.13533528323661267 (0x1.152aaa3bf81cbp-3) got 0.0 (0x0.0p0).
Failure for StrictMath.pow(double, double): For inputs -0.9999999999999998 (-0x1.ffffffffffffep-1) and -9.007199254740991E15 (-0x1.fffffffffffffp52) expected -7.38905609893065 (-0x1.d8e64b8d4ddaep2) got -Infinity (-Infinity).
Failure for StrictMath.pow(double, double): For inputs -3.0 (-0x1.8p1) and 9.007199254740991E15 (0x1.fffffffffffffp52) expected -Infinity (-Infinity) got Infinity (Infinity).
The code changes to address the bug were fairly simple; corrections were made to extracting components of the floating-point inputs and sign information was propagated properly.
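Two of the failing inputs above are easy to re-probe on a current JDK. The sketch below is an assumption-laden illustration rather than an excerpt from PowTests.java: the class name `PowSignProbe` and the `sameBits` helper are made up here, and only the sign-related cases are checked because their expected values follow directly from the special-case rules.

```java
final class PowSignProbe {
    // Bit-level equality, so that -0.0 and +0.0 count as different results.
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        double largeOdd = 0x1.fffffffffffffp52; // 2^53 - 1, the largest odd double

        // |x| < 1 raised to a huge odd power underflows; the sign of x must survive.
        double tiny = StrictMath.pow(-0.5, largeOdd);
        System.out.println(tiny + " matches -0.0: " + sameBits(tiny, -0.0));

        // |x| > 1 raised to a huge odd power overflows; the sign of x must survive.
        double huge = StrictMath.pow(-3.0, largeOdd);
        System.out.println(huge + " matches -Infinity: "
                + sameBits(huge, Double.NEGATIVE_INFINITY));
    }
}
```

On an implementation with the FDLIBM 5.3 fix both lines should report a match; the failure log shows the unfixed code losing the sign in exactly these two cases.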
Even expertly written software can have errors and even long-used software can have unexpected problems. Estimating how often this bug in FDLIBM caused an issue is difficult; while the errors could be egregious, the inputs needed to elicit the problem were arguably unusual (even though perfectly valid mathematically). Thorough testing is a key aspect of assuring the quality of numerical software; it is also helpful for end-users to be able to examine the output of their programs to help notice problems.
Doug pointed out yesterday how Eclipse startup performance isn't great. I asked on his blog entry, wouldn't it be great if we had something like bootchart, or Michael Meeks' work (FOSDEM presentation), for Eclipse?
Would this be a good Eclipse Summer of Code project?
The initial way a string switch statement was implemented in JDK 7 was to desugar a string switch into a series of two switch statements, the first switch mapping from the argument string's hash code to the ordinal position of a matching string label followed by a second switch mapping from the computed ordinal position to the code to be executed.
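As a rough picture of that shape, here is a hand-written approximation of the two-switch form for a three-label switch. It is a sketch only: the class name `TwoSwitchSketch`, the method `describe`, and the variable `index` are invented for illustration, and the code javac actually generates differs in its details.

```java
final class TwoSwitchSketch {
    // Hand-desugared equivalent of:
    //   switch (s) { case "azvl": ...; case "bmjrabc": ...; case "foo": ...; default: ... }
    static String describe(String s) {
        int index = -1; // ordinal position of the matching label, -1 for "no match"

        // First switch: hash code -> ordinal position, confirmed with equals().
        switch (s.hashCode()) {
            case 3010735: // "azvl".hashCode() and "bmjrabc".hashCode()
                if (s.equals("azvl")) {
                    index = 0;
                } else if (s.equals("bmjrabc")) {
                    index = 1;
                }
                break;
            case 101574: // "foo".hashCode()
                if (s.equals("foo")) {
                    index = 2;
                }
                break;
        }

        // Second switch: ordinal position -> the code of the original alternative.
        switch (index) {
            case 0:
                return "matched azvl";
            case 1:
                return "matched bmjrabc";
            case 2:
                return "matched foo";
            default:
                return "default";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("azvl"));    // matched azvl
        System.out.println(describe("bmjrabc")); // matched bmjrabc
        System.out.println(describe("nope"));    // default
    }
}
```

Because the second switch is an ordinary integer switch over the label positions, fall-through and `default` in the original body carry over unchanged.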
Before this approach was settled on, Jon, Maurizio, and I had extensive discussions about alternative implementation techniques. One approach from Maurizio that we seriously considered employed labeled break statements (in lieu of unavailable goto statements) to allow a string switch to be desugared into a single integer switch statement. In this approach as well, the integer switch is built around the strings' hash codes.
One kind of complication in desugaring string switch statements stems from irregular control flow, such as when control transfers to one label, code is executed, and then control falls through to the code under the next label rather than exiting the switch statement after the initial code execution. When using hash codes to identify the string being switched on, another class of complications stems from dealing with the possibility of hash collisions, the situation where two distinct strings have the same hash code. A string can be constructed to have any integer hash code so collisions are always a possibility. Since many strings have the same hash code, it is not sufficient to verify the string being switched on just has the same hash value as a string case label; the string being switched on must be checked for equality with the case label string. Furthermore, when two string case labels have the same hash value, a string being switched on with a matching hash code must be checked for equality potentially against both case labels.
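A collision is easy to exhibit. The sketch below uses the pair "azvl" and "bmjrabc" from the example switch in this post, plus the well-known pair "Aa" and "BB"; the class name `HashCollisionDemo` and the `collide` helper are assumptions made for illustration.

```java
final class HashCollisionDemo {
    // True when a and b are different strings that share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("azvl".hashCode());           // 3010735
        System.out.println("bmjrabc".hashCode());        // 3010735
        System.out.println(collide("azvl", "bmjrabc"));  // true
        System.out.println(collide("Aa", "BB"));         // true
        System.out.println(collide("foo", "bar"));       // false
    }
}
```

This is why a matching hash code can only narrow down the candidates; the `equals` call is what decides the match.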
While relying on hash codes to implement string switch is contentious with some, the hashing algorithm of java.lang.String is an extremely stable part of the platform and there would be too much behavioral compatibility risk in changing it. Therefore, the stability of the algorithm can be relied on as a resource in possible string switch implementations.
Switching on the hash code in the desugaring confers a number of benefits. First, and most immediately, the hash code maps the string to an integer value, matching the type required for the existing switch statement.
Second, switching on the hash code of a string bounds the worst case behavior. The simplest way to see if a chosen string is in a set of other strings, such as the set of string case labels, would be to compare the chosen string to each of the strings in the set. This could be expensive since the chosen string would need to be traversed many times, potentially once for each case label. The hash code of a string is typically cached after it is first computed. Therefore, when switching on the hash code, the chosen string is not expected to be traversed more than twice (once to compute the hash code if not cached, again to compare against strings from the set of strings with the same hash value — a set usually with only one element).
If instead of a series of two desugared switch statements, only a single switch statement were desired in the desugaring, extra synthetic state variables could be used to contend with hash collisions, fall-through control flows, and default cases, as described in the Project Coin strings in switch proposal. A goto construct could be used to eliminate state variables, but goto is neither available in the source language nor in javac's intermediate representation. However, by a novel use of nested labeled breaks, a single switch statement can be used in the desugaring without introducing additional synthetic control variables.
Consider the strings switch statement in the method f below:
    static void f(String s) { // Original sugared code
        switch (s) {
            case "azvl":
                System.out.println("azvl: "+s); // fallthrough
            case "quux":
                System.out.println("Quux: "+s); // fallthrough
            case "foo":
                int i = 5; //fallthrough
            case "bar":
                System.out.println("FooOrBar " + (i = 6) + ": "+s);
                break;
            case "bmjrabc": // same hash as "azvl"
                System.out.println("bmjrabc: "+s);
                break;
            case "baz":
                System.out.println("Baz " + (i = 7) + ": "+s); // fallthrough
            default:
                System.out.println("default: "+s);
        }
    }
and the following desugaring procedure.
Create a labeled block to enclose the entire switch statement. Within that enclosing block, create a series of nested blocks, one for each case label, including a default option, if any. In the innermost block, have a switch statement based on the hash code of the strings in the original case labels. For each hash value present in the set of case labels, have an if-then-else chain comparing the string being switched on to the cases having that hash value, breaking to the corresponding label if there is a match. If a match does not occur and the original switch has a default option, a break should transfer control to the label for the default case; if the original switch does not have a default option, a break should occur to the switch exit label.
If a hash value only corresponds to a single case label, the sense of the equality/inequality comparison in the desugared code can be tuned for branch prediction purposes. After the block for a case label is closed, the code for that alternative appears. In the original switch code, there are two normal completion paths of interest: the code for an alternative is run and execution falls through to the next alternative or there is an unlabeled break to exit the switch. In the desugaring, these paths are represented by execution falling through to code for the next alternative and by a labeled break to the label synthesized for the switch statement exit. The preservation of fall through semantics is possible because the code interspersed in the nested labeled statements appears in the same textual order as in the original "sugared" string switch. Local variables can be declared in the middle of a switch block. In desugared code, such variable declarations are hoisted out to reside in the block for the entire switch statement; the declaration of the variable and its uses are then renamed to a synthetic value to avoid changing the meaning of names in other scopes. Sample results of this procedure are shown below.
    static void f(String s) { // Desugared code
        $exit: {
            int i$foo = 0;
            $default_label: {
                $baz: {
                    $bmjrabc: {
                        $bar: {
                            $foo: {
                                $quux: {
                                    $azvl: {
                                        switch(s.hashCode()) { // cause NPE if s is null
                                            case 3010735: // "azvl" and "bmjrabc".hashCode()
                                                if (s.equals("azvl"))
                                                    break $azvl;
                                                else if (s.equals("bmjrabc"))
                                                    break $bmjrabc;
                                                else
                                                    break $default_label;
                                            case 3482567: // "quux".hashCode()
                                                if (!s.equals("quux")) // inequality compare
                                                    break $default_label;
                                                break $quux;
                                            case 101574: // "foo".hashCode()
                                                if (s.equals("foo")) // equality compare
                                                    break $foo;
                                                break $default_label;
                                            case 97299: // "bar".hashCode()
                                                if (!s.equals("bar"))
                                                    break $default_label;
                                                break $bar;
                                            case 97307: // "baz".hashCode()
                                                if (!s.equals("baz"))
                                                    break $default_label;
                                                break $baz;
                                            default:
                                                break $default_label;
                                        }//switch
                                    }//azvl
                                    System.out.println("azvl: "+s); // fallthrough
                                } //quux
                                System.out.println("Quux: "+s); // fallthrough
                            } //foo
                            i$foo = 5;
                        }//bar
                        System.out.println("FooOrBar " + (i$foo = 6) + ": "+s);
                        break $exit;
                    }//bmjrabc
                    System.out.println("bmjrabc: " + s);
                    break $exit;
                } //baz
                System.out.println("Baz " + (i$foo = 7) + ": "+s); // fallthrough
            }//default_label
            System.out.println("default: "+s);
        }//exit
    }
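A cut-down version of the same idea can be compiled and compared against an ordinary string switch. This is a sketch under assumptions: the class `LabeledBreakSketch`, its two methods, and the choice to collect output in a `StringBuilder` (so the two forms can be compared) are all invented for this illustration, and only two labels plus a default are covered.

```java
final class LabeledBreakSketch {
    // Reference behaviour, written as a plain string switch with one fall-through.
    static String sugared(String s) {
        StringBuilder out = new StringBuilder();
        switch (s) {
            case "foo":
                out.append("foo;"); // fallthrough
            case "bar":
                out.append("bar;");
                break;
            default:
                out.append("default;");
        }
        return out.toString();
    }

    // The same logic with nested labeled blocks around a single integer switch.
    static String desugared(String s) {
        StringBuilder out = new StringBuilder();
        $exit: {
            $default_label: {
                $bar: {
                    $foo: {
                        switch (s.hashCode()) {
                            case 101574: // "foo".hashCode()
                                if (s.equals("foo"))
                                    break $foo;
                                break $default_label;
                            case 97299: // "bar".hashCode()
                                if (s.equals("bar"))
                                    break $bar;
                                break $default_label;
                            default:
                                break $default_label;
                        }
                    } // foo
                    out.append("foo;"); // falls through into the code for "bar"
                } // bar
                out.append("bar;");
                break $exit;
            } // default_label
            out.append("default;");
        } // exit
        return out.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"foo", "bar", "baz", ""}) {
            System.out.println(s + " -> " + desugared(s)
                    + " (same as sugared: " + desugared(s).equals(sugared(s)) + ")");
        }
    }
}
```

Leaving the `$foo` block lands on the code for "foo", which then runs on into the code for "bar" simply because it comes next in textual order; that is the fall-through preservation described above.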
While the series of two switches and the labeled break-based desugaring were both viable alternatives, we chose the series of two switches since the transformation seemed more localized and straightforward. The two-switch solution also has simpler interactions with debuggers. If string switches become widely used, profiling information can be used to guide future engineering efforts to optimize their performance.
A friend gave me a CD from Paul Kalkbrenner, and surprisingly, I really quite like it. Check it out:
Speaking of Berlin calling, if all goes well, we'll be moving to Berlin (area) later this year. Berlin is indeed calling (me).