psychology blog: dev

Showing posts with the label dev.

Saturday, 2 October 2010

Finally! JavaFX script is dead

I never fully understood JavaFX script. It seemed like a nice enough language, but it neither completely replaced Java nor integrated well with it. Furthermore, some of the features of its library were sorely missing from Java, but not accessible from it. This is why it is refreshing to see the new JavaFX Roadmap. Its main tenet is that JavaFX script is being discontinued and that JavaFX becomes a framework for Java with many exciting features:

Binding API: Well, OK, yet another one. Here's hoping that it will not be too cumbersome and that it will find broad acceptance. The API will include observable collections (which are handy for GUI lists and tables).

Media framework: for audio and video. About time.

HTML5 support: parsing and display. Also desperately needed in Swing (SWT already has reasonable HTML display support).

New table control: Apart from better looks, combining it with observable collections should get one the comfort offered by Glazed Lists.

New rich text control.

Finally Java gets back the human resources that had been moved to JavaFX script by Sun. The new JavaFX features will make it much more appealing for developing desktop applications.

Update 2010-12-15: Why the new JavaFX makes sense

Friday, 24 September 2010

Usability: phone numbers and special characters

Why do so many web sites insist that, when entering a phone number, you not type any non-digits such as spaces, hyphens, parentheses or slashes? The structure that these characters bring to phone numbers helps humans considerably and computers can easily filter them out. So what is the harm of allowing them? I would even store phone numbers in a database as they were entered.
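
As a minimal sketch of how little code the filtering takes (the class and method names are made up for illustration), the application can keep the number exactly as entered for display and derive a digits-only form whenever it needs to compare or dial:

```java
// Hypothetical helper, not taken from any particular library.
class PhoneNumbers {
    /** Strips everything except ASCII digits; a leading plus sign is preserved. */
    static String normalize(String entered) {
        String trimmed = entered.trim();
        boolean international = trimmed.startsWith("+");
        String digits = trimmed.replaceAll("[^0-9]", "");
        return international ? "+" + digits : digits;
    }

    public static void main(String[] args) {
        System.out.println(normalize("(089) 123-456 / 78"));
        System.out.println(normalize("+49 89 1234 5678"));
    }
}
```

The stored value stays human-readable; the normalized value is computed on demand.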

Friday, 17 September 2010

Seven don’ts for websites

Source: xkcd

The following is a list of seven things that frequently bug me about websites.

Intro page where you have to choose your location or language. If I go to foo.com then the .com indicates that I want to see the American version of the Foo Inc. website. I do not want to see a map of the world where I need to click several times to finally get to my destination. Example: gilette.de. A better solution is to put a link somewhere that allows you to jump to other versions. Flag icons work well here, because you don’t need to understand the language of the current version to jump to a different one (e.g. finding “Germany” on a Chinese website is difficult if you don’t read Chinese).

Site version determined by IP location. If you use a web browser, the websites you are visiting know your IP address (four numbers such as 127.0.0.1). In principle, this address is completely abstract (as opposed to, say, ZIP codes which can be mapped to a location), but there are databases that allow you to map it back to a location. This works pretty well, but is sometimes abused to automatically switch to the language of your location. But what if you are an American who is abroad and wants to access an American website? Or if you are German and want to check out an American website (not its German version)? Example: eonline.com. Solution: A small note in the language the site thinks it has detected. Something like “Click here to see the English version of this page”.

Skippable intro page. Often such a page shows a movie that tells you what the website is about. By all means, link to introductory information on the home page, but don’t force me to watch or read it every time.

Complete site implemented in Flash. I don’t particularly like Flash. It still has its uses for video, but most other things can now be done in HTML5. With Flash, you cannot bookmark pages or copy text. The website’s content cannot be found via Google and it won’t work on (most) mobile devices. Furthermore, most Flash websites make up strange new ways of navigation. Why change something that people know and that works well? Example: gilette.de.

Ugly URLs. URLs should be compact and easily understandable by humans. That is, one should be able to figure out what a page is about by looking at the URL. Thus, if the page ever goes away, one has a greater chance of finding out where it went. Amazon is both a sinner and a saint here. Some Amazon URLs have a lot of ugly pieces in them (“ref” and such). On the other hand, book URLs sometimes include the ISBN and an abbreviation of the title. This is a great practice, because the ISBN is a unique ID that is useful to both machines and humans and because the title allows humans to figure out what is there. Lazy programmers sometimes let the fact that there is a single script that displays all web pages show up in the URL: www.example.com/display.php?page=start. Even worse are meaningless page IDs (?page=17). Both can be avoided by putting in a little more effort.

Content is hard to find. Often, a website shows all kinds of details, so that the things that people are most frequently looking for are hard to find. Expert-designed web pages suffer from this in particular, because experts often find it difficult to focus within their own area of expertise. Examples are bank and government websites. But food websites are also often problematic: I don’t want to play a game, I want to find out about the products and/or their ingredients. Example: See picture above.

Error page discards original URL. This is fortunately rare, but every now and then, I discover an error page in my browser that says “page does not exist”. The problem is that the original URL is nowhere to be found: The error page URL has replaced it in the browser address bar (=history cannot be used) and it isn’t displayed on the error page. If there are many tabs open then it’s really hard to figure out what went wrong.

Bonus: Forbidding or enforcing characters in passwords. As if passwords weren’t annoying enough on their own, some sites decide to make handling them even more complicated. Hassles I’ve experienced so far were: At most 10 characters allowed, no punctuation allowed, must use a digit.

Wednesday, 11 August 2010

Great channel for semantic web videos

Check it out on Vimeo.

Sunday, 1 August 2010

Why we are actually writing getters and setters in Java

I now often hear the opinion that writing getters and setters has something to do with better encapsulation, that using “naked” fields is bad practice. To find out if this is true, we have to look at Java history. In the mid-nineties, Sun developed the JavaBeans specification as a component model for Java. This model was supposed to help with tool support for Java, e.g. when connecting a graphical user interface with domain objects. In this case, it is useful if one can observe changes made to fields and react to them (e.g. by updating the text displayed in a window). Alas, while there are some languages that allow this kind of meta-control (Python and Common Lisp come to mind), Java does not. Thus, JavaBeans introduced standardized naming that allowed one to implement a field as a pair of methods which then would manually implement the observation.
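
To make the observation mechanism concrete, here is a minimal sketch of such a bean that uses `java.beans.PropertyChangeSupport` from the standard library. The `Person` class and its `name` property are invented for the example; only the get/set naming pattern comes from the JavaBeans conventions.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical bean with one observable property called "name".
class Person {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // Notifies listeners unless old and new value are equal and non-null.
        changes.firePropertyChange("name", oldName, newName);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    public static void main(String[] args) {
        Person person = new Person();
        // A GUI would update a text field here instead of printing.
        person.addPropertyChangeListener(event ->
            System.out.println(event.getPropertyName() + ": "
                + event.getOldValue() + " -> " + event.getNewValue()));
        person.setName("Ada");
    }
}
```

The setter is the only place where the notification can be triggered, which is why a plain public field cannot play this role.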

I usually code as follows: If I need just a field, I use a public field (no getters and setters), because it helps me to get started quickly and introduces less clutter. If I later change my mind, I let Eclipse introduce the indirection of the getter and setter. That means that there is no penalty for such a change and no need to think ahead! Granted, having both public fields and getters/setters affects uniformity, but the added agility is worth it for me.

Obviously, it would be nice if Java had true observable (and optionally computable) fields. This feature was initially on the table for Java 7, but did not make the cut. Maybe IDEs could help by displaying getters and setters as if they were fields. Their source code would be hidden, with visual cues indicating if such a pseudo-field is read-only etc. Additionally, auto-expansion would be improved, because pseudo-getters (such as Collection.size()), getters/setters, and fields would all be part of the same category. No more typing “.get” and hoping that the information that you are looking for is available as a properly named getter. The same kind of grouping should also be made in JavaDoc. Lastly, one could display foo.setValue("abc") as foo.value = "abc". But I’m not sure if that makes sense.

Addendum (2010-08-07): I think I did not make my point clear. It was not “use public fields”, it was “don’t use getters and setters blindly”. I’m applying the coding style mentioned above during an exploratory phase of coding. IDEs such as Eclipse allow you to do this kind of quick and dirty exploration because real getters and setters are always just a refactoring away. I do agree that, as soon as the API and its client code are not in the same code base, you cannot do these refactorings any more. Thus, you have to think ahead and freeze some things.

As for generating getters and setters: Yes, Eclipse does that for you. It even expands getFoo into foo getter source code and setFoo into foo setter source code. And it can also rename the setters and getters for you while renaming a field. Even then, getters and setters still add clutter.

Saturday, 31 July 2010

The slides from the JVM language summit are online

The slides from the JVM language summit (summary) are now online. There is some good stuff, for example: “Engineering Fine-Grained Parallelism in Java” by Doug Lea.

Tuesday, 27 July 2010

The new Blogger editor is great

In case you haven’t noticed: Blogger has a new editor (you have to explicitly enable it). And it’s great: No more window-global dialogs (e.g. to enter the URL of a link). True previews of a post. Bullet lists that can be indented. Breaks between the introduction and the main article. A resizable editor field. Shortcuts for applying font styles. Etc. In short, it fixes most of my complaints. A few wishes remain:

Touch support: WYSIWYG composing is not supported on iPhone/iPad and dragging the corner of the editor to resize does not work.

Navigation (between posting, settings, etc.) could be more streamlined: Some operations should be easier to reach, others should be harder to reach (e.g., how often do you change the design of your blog?), there is some clash between the tabs and the bar at the top, etc. Mozilla has performed a user interface study to solve this kind of problem for Firefox.

Paste without formatting should be the default.

Leftovers from my past wishes: Smart quotes, tables, inserting symbols, wider layouts (mentioned in the comments), paragraph styles (headings, pre, ...).

But, apart from that, I am very happy with the new editor.

Saturday, 24 July 2010

Teaching RDF

I recently gave a 90-minute lecture on RDF. In it, I followed the obvious path of explaining the usefulness of RDF by showing how it can be interpreted in several ways (set of triples, resources, graph). For a hands-on session, I needed a way to interactively create and query RDF, so I’ve added functionality to Hyena: In the “Query” zone, one can edit a graph in Turtle syntax and query the repository with SPARQL. It turned out that there was a nice synergy between this zone and the rest of Hyena, because the encoded wiki pages plus attached tags provided nice “real-world” example data. As an exercise, I asked my audience to express in SPARQL the query “all wiki pages that are tagged with ‘Todo’”.

Update: More RDF shells

sparql-query: A shell for accessing SPARQL endpoints. [Source: Mischa Tuffield]

OntoWiki has an interactive query shell with SPARQL syntax highlighting, saved queries and other features. [Source: Sebastian Tramp]

The SparqlTrainer is an e-learning tool to practice SPARQL interactively. [Source: Sebastian Tramp]

Thursday, 22 July 2010

Inconsistent information in your database

The blog post “Rethinking Form Validation” describes an interesting idea (apparently inspired by one of Alan Cooper’s books): While developers are fond of only storing information that is fully validated, it may help end users if they can store inconsistent data. Related examples include forms that force you to only enter digits for phone numbers (no spaces, dashes, parentheses, etc.) or some obvious characters being forbidden from passwords. Validation should be unobtrusive, because there are always going to be unforeseen cases where rigid control works against the user. Eclipse’s handling of Java syntax errors is exemplary: You are warned about them, but you are not prevented from continuing your work.

Sunday, 11 July 2010

Use a single version number for Ant and Java (bonus: GWT)

Problem: If your application has a version number, it should be accessible during run time from Java (e.g., to display it in an “About this application” dialog) and during build time from Ant (e.g. to include it in file names). The solution is as follows.

Access the version from Java

Create the following properties file src/de/hypergraphs/hyena/core/client/bundle/BuildConstants.properties and put it into the class path.


    buildVersion=0.2.0

Access BuildConstants.properties as a Java resource. I usually construct the resource path relative to a Java class (a sibling of the file). That way the path to the properties file will always stay up-to-date, as long as I move the Java class with it.
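
A minimal sketch of that lookup, assuming a hypothetical class `BuildInfo` that lives in the same package as BuildConstants.properties (the class name and the fallback value "unknown" are choices made for this example):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sibling class of BuildConstants.properties.
class BuildInfo {
    /** Reads buildVersion from a properties stream; "unknown" if the key is absent. */
    static String readVersion(InputStream in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("buildVersion", "unknown");
    }

    /** Resolves the file relative to this class, so both can be moved together. */
    static String buildVersion() {
        // A name without a leading slash is looked up in the package of BuildInfo.
        try (InputStream in = BuildInfo.class.getResourceAsStream("BuildConstants.properties")) {
            return in == null ? "unknown" : readVersion(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Version: " + buildVersion());
    }
}
```

If the properties file is not on the class path, `getResourceAsStream` returns null and the sketch falls back to "unknown" instead of failing.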

Ant

Ant can read the entries of an external properties file as Ant properties with the following element.

    <property file="src/de/hypergraphs/hyena/core/client/bundle/BuildConstants.properties"/>

Additionally, you can insert the value of ${buildVersion} into a file while copying it, by using a filterset.

    <copy file="${data.dir}/index.html" todir="${version.dir}">
        <filterset>
            <!-- Replace @VERSION@ with the version -->
            <filter token="VERSION" value="${buildVersion}"/>
        </filterset>
    </copy>

GWT

For client-side GWT, you can use constants. Then the version number is compiled directly into the JavaScript code. To do so, you add the following interface as a sibling of BuildConstants.properties.

    package de.hypergraphs.hyena.core.client.bundle;

    import com.google.gwt.i18n.client.Constants;

    public interface BuildConstants extends Constants {
        String buildVersion();
    }

Friday, 9 July 2010

Running Tomcat on port 80 in a user account

If you already have a servlet container and also need a web server, there is usually no need to turn to a dedicated web server such as Apache. Instead, your servlet container can easily perform double duty if you put your HTML files into the “ROOT” web application. If you run Tomcat on Linux, you have two choices: First, run it under a user account. Then you can only use “non-privileged” ports, which start at 1024 (this is why Tomcat’s default is to use port 8080). Second, run it as root, but that poses security risks. There are many solutions out there for running Tomcat on port 80 under a user account. The simplest solution that I have found is to use authbind. To do so, you need to perform the following steps:

Install authbind

Make port 80 available to authbind (you need to be root):

    touch /etc/authbind/byport/80
    chmod 500 /etc/authbind/byport/80
    chown glassfish /etc/authbind/byport/80

Make IPv4 the default (authbind does not currently support IPv6). To do so, create the file TOMCAT/bin/setenv.sh with the following content:

    CATALINA_OPTS="-Djava.net.preferIPv4Stack=true"

Change startup.sh so that Tomcat is started through authbind:

    # OLD: exec "$PRGDIR"/"$EXECUTABLE" start "$@"
    exec authbind --deep "$PRGDIR"/"$EXECUTABLE" start "$@"


Monday, 5 July 2010

RDF (almost) is the next generation of relational databases

I love RDF and SPARQL, especially their elegance and simplicity. They surely deserve a lot more attention and not just as a formalism for ontologies and semantics, but also as a next step for relational databases. Especially with the “No SQL” movement becoming popular, RDF could be an alternative that builds on the achievements of the relational database community instead of shunning them. Note that the No SQL implementation Couch DB offers JavaScript-centricity and is a little bit simpler than RDF, so one might prefer it for some scenarios. On the other hand, RDF is not much more complicated and offers other features (composable data, standardized symbols, a general-purpose query language, etc.) that Couch DB cannot match. Alas, some of the basics are still complicated in RDF, such as listing properties in a table. My paper “Using RDF for social information management” has more on this topic.

Friday, 2 July 2010

Free text book on RDF (foundation of Semantic Web)

My dissertation is online. While many chapters are specific to the topic of the dissertation, some chapters of it should be very readable introductions to RDF and related ideas such as Linked Data. While RDF is the foundation of the Semantic Web, there are two communities using it:

(1) RDF as a knowledge representation: This community is concerned with semantics, ontologies, etc.

(2) RDF as data: This community uses RDF as a next-generation relational database.

The focus of this dissertation is (2). Recommended reading:

Part I: Background. Explains RDF, Linked data on the web, folksonomies, ontologies, schema and ontology languages.

Part VI: Related work. Mentions work that is related to Hyena: information management, hypertext, etc.

Saturday, 19 June 2010

How to display CVS log messages chronologically

Yes, I sometimes still use CVS. And I just found out that it is incapable of displaying the logged commit messages in chronological order. It can only show them for each file, with the resulting duplicates and all-around chaos if you do this for every file in a module. Thankfully, there is a way around this. A tool called cvs2cl.pl converts the cvs log output into something readable.

Friday, 18 June 2010

Servlet sessions and automatic login: standard Java EE might not be enough for you

Java servlet session management works well for basic requirements, but has limits when it comes to advanced features:

There is no standard global view of all the sessions, since HttpSession.getSessionContext() has been deprecated. If you want access to all sessions, you have to set up your own registry.

You have relatively little control of when the session expires. For example, there is no standardized way of accessing the session cookie and extending its lifespan beyond browser restarts.

Any kind of server access keeps the session alive: Long-polling is still a common technique for sending events from the server to the client, and it prevents a session from ever becoming inactive.

These kinds of limitations become relevant when you need to implement automatic login. There, you have the following options:

Store user name and password in a cookie: This is inherently unsafe and should never be done.

Let the browser remember user name and password: Firefox does this, but only for forms that exist at page load time. It is thus very complicated to get this to work for Ajax dialogs.

Keep the session around longer: One needs to control session timeout (after a given period of inactivity) and possibly cookie expiration (the session ID is normally removed once one quits the browser).

Simple solution, standard Java EE:

During login, ask for the period of time one should stay logged in (if there is no activity).

On the server, use HttpSession.setMaxInactiveInterval(). Beware: Some servlet containers seem to create a new session when this method is invoked.

Problems: (1) Long-polling is registered as usage. (2) You cannot extend session lifespan beyond the next browser termination (because the cookie with the session ID will be removed).

Comprehensive solution, manual session management:

Manage your sessions yourself. The client initially receives a session ID from the server and then sends it with each request to the server. The login security FAQ [1] has more details.
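
The server side of such a scheme can be sketched with nothing but the standard library. Everything below (the class `SessionRegistry`, its method names, the 128-bit ID length, and passing the current time as a parameter) is an assumption made for illustration and is not taken from the FAQ. A real implementation would also have to persist sessions and send the ID over a secure channel.

```java
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: maps session IDs to user names with an absolute expiry time.
class SessionRegistry {
    private static final class Session {
        final String user;
        final long expiresAtMillis;

        Session(String user, long expiresAtMillis) {
            this.user = user;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final SecureRandom random = new SecureRandom();
    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    /** Creates a session that stays valid for lifetimeMillis, independent of activity. */
    String login(String user, long lifetimeMillis, long nowMillis) {
        String id = new BigInteger(128, random).toString(16);
        sessions.put(id, new Session(user, nowMillis + lifetimeMillis));
        return id;
    }

    /** Returns the user for a valid session ID, or null if it is unknown or expired. */
    String userFor(String id, long nowMillis) {
        Session session = sessions.get(id);
        if (session == null) {
            return null;
        }
        if (nowMillis >= session.expiresAtMillis) {
            sessions.remove(id);
            return null;
        }
        return session.user;
    }

    /** A global view of all sessions, which the servlet API no longer offers. */
    int activeSessionCount() {
        return sessions.size();
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        long now = System.currentTimeMillis();
        String id = registry.login("alice", 14L * 24 * 60 * 60 * 1000, now);
        System.out.println(registry.userFor(id, now));
    }
}
```

Because a lookup does not extend the expiry time in this sketch, long-polling requests cannot keep a session alive; whether activity should extend a session is then an explicit decision of the application.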

It would be interesting to integrate this kind of session management with Google Guice which currently supports servlet sessions via a dedicated scope.

Tuesday, 8 June 2010

Google wants us to print to the cloud

This is an interesting announcement. In Google’s vision of the future, a printer can make itself available to the cloud and then be used from any internet-connected device. This includes cell phones, but also web applications. While I like the idea, I wonder if people will be willing to take the security risks. Discovering printing services in a LAN (via zero-configuration service discovery, as implemented by Apple’s Bonjour and Microsoft’s Universal Plug and Play) might be more palatable.

Update 2011-01-26: Google brings Cloud Print service to mobile Google Docs, Gmail

Monday, 3 May 2010

Facebook adopts RDFa

This is a big deal. RDFa allows one to search web pages as if they were databases. Think Google, but with additional options such as: “Show me movies that ...” or “Show me books written by ...” or even “Show me opinions on books written by ...”. With Facebook’s weight behind it, we will hopefully see wider adoption. Publishers that initially support Facebook’s Open Graph standard are IMDb, Microsoft, NHL, Posterous, Rotten Tomatoes, TIME, and Yelp. That already includes quite a bit of useful data. Well, at least as far as movies are concerned.

Sunday, 2 May 2010

Running a WAR as a desktop application

If you have written a web application, the next logical step is to make it available offline. The long-term solution is clear: You give your web application an offline mode, which will hopefully be complemented by explicit application management in web browsers. Short- to mid-term, though, that is often not feasible, because the server provides crucial functionality to the client. Thus, I was looking for a different solution.

One option I had seen was Hudson’s self-executable WAR file: When you execute java -jar hudson.war, an embedded web server starts and you can immediately try out Hudson. This approach had two limitations that I didn’t like. First, I wanted my solution to be acceptable for end users, so I wanted a graphical user interface. Second, the embedded web server extracted the WAR file to a temporary directory and did so again for each startup, wasting time. In contrast, my solution does the following things:

The application is a JAR file, the WAR is embedded inside. I opted against a single binary for JAR and WAR, because packaging the JAR as a nice desktop application adds platform-specific data anyway, and often makes it impossible to deploy the result as a WAR.

When the JAR starts up, a Swing user interface is shown. If this hasn’t been done before, the WAR file is extracted to a directory next to the JAR file. The idea is that there is a dedicated folder for the application in which the JAR resides. The location of the JAR is determined via this trick. The WAR file (=ZIP format) in the classpath is extracted to the file system using the standard Java API.

Next, a web server is started and pointed to the web application directory in the file system (none of the web servers I’ve seen is able to serve a WAR file directly, let alone one that is embedded inside a JAR file). I used Winstone, because it is so small. There are even smaller ones, but those won’t allow you to use servlets. Obviously, any embeddable Java servlet container will do.

A button allows the user to open the starting page of the web application in the default web browser. java.awt.Desktop (Java 6) allows you to do this. Currently, localhost and a fixed port are used. In the future, the port should be configurable and one could maybe automatically switch to a different port if the default one is occupied.

Finally, I also produced a Mac OS X application. Such an application is a folder with the file name extension “.app” and the JAR file inside. This has the nice side effect of hiding the extracted WAR directory.
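
The extraction step from the list above can be sketched with `java.util.zip` alone. The class and method names are hypothetical and this is not the code from the downloadable project; treating "the target directory already exists" as "already extracted" is an assumed simplification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper that unpacks a WAR (which is a ZIP archive) once.
class WarExtractor {
    /** Unpacks the WAR stream into targetDir; returns false if targetDir already exists. */
    static boolean extractOnce(InputStream war, Path targetDir) throws IOException {
        if (Files.exists(targetDir)) {
            return false; // extracted during an earlier start
        }
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        try (ZipInputStream zip = new ZipInputStream(war)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would escape the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies only the current entry; the ZIP stream stays open.
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: WarExtractor <file.war> <targetDir>");
            return;
        }
        try (InputStream war = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(extractOnce(war, Paths.get(args[1])) ? "extracted" : "already present");
        }
    }
}
```

For a WAR that is embedded in the JAR, the stream would come from `Class.getResourceAsStream()` instead of a file; the resource name depends on how the JAR is packaged.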

This is all there is to it. Now even non-technical end users can try out my web application without having to go through the steps of downloading and installing a web server. You can download the Eclipse project of my implementation (be warned, the code is quite experimental).

Thursday, 29 April 2010

Finding out where your class files are

Sometimes, when you are writing a program, you need to install data (e.g. by unzipping example files) or to find additional data (e.g. plugins). If you know the location of your class files, you can put the data close to them or start your search there. Your program will usually be packaged as a JAR file, so this scheme allows you to keep everything in the same directory. But any solution should also work for unarchived files, because you want to test while developing your (un-jarred) code. Until now, I’ve always used hacks involving Class.getResource() and looking for exclamation marks in the returned URL to find out if the program is currently running from a JAR. Recently, I’ve discovered a simpler solution (source):

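    // Needs java.io.File and java.security.ProtectionDomain
    ProtectionDomain protectionDomain = HyenaDesk.class.getProtectionDomain();
    // Directory (unarchived) or JAR file that HyenaDesk was loaded from
    File codeLoc = new File(protectionDomain.getCodeSource().getLocation().getFile());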

For example, this code computes the following two values for codeLoc.

Unarchived: /home/rauschma/workspace/hyena_desk/bin

JAR: /home/rauschma/workspace/hyena_desk/build/hyena_desk.jar
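
One caveat: `URL.getFile()` leaves URL escapes such as `%20` in place, so a directory name that contains spaces yields a path that does not exist. A slightly more defensive sketch (the class name `CodeLocation` is made up here) converts the URL to a URI first and also handles classes that have no code source:

```java
import java.io.File;
import java.net.URISyntaxException;
import java.security.CodeSource;

// Hypothetical helper around the ProtectionDomain lookup.
class CodeLocation {
    /** Returns the directory or JAR file a class was loaded from, or null if unknown. */
    static File of(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null; // e.g. classes loaded by the bootstrap class loader
        }
        try {
            // File(URI) decodes escapes such as %20, unlike URL.getFile().
            return new File(source.getLocation().toURI());
        } catch (URISyntaxException | IllegalArgumentException e) {
            return null; // the location is not a usable file: URI
        }
    }

    public static void main(String[] args) {
        System.out.println(of(CodeLocation.class));
    }
}
```

Returning null keeps the sketch short; a real program might prefer to fall back to a configurable data directory.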

Wednesday, 28 April 2010

Mac OS Eclipse: preventing GUI tree memory

Since Mac OS Eclipse migrated from Carbon to Cocoa, the tree widgets have changed their behavior. Previously, if you closed and re-opened the root of a deeply expanded subtree, only its children would be visible, collapsed. With Cocoa, all of its descendants reappear in the same state as before. This makes it difficult to clean up deep expansions.

Fortunately, there is a way to get the old behavior back, via a key combination: The right and left arrow keys without any modifier key open and close a folder. If you hold the option key, the right arrow expands everything (be careful with that in the Eclipse Package Explorer) and the left arrow collapses everything. The next time you open a node that you have closed with option-left, only its children are visible.