Saturday, 18 February 2012

JAXB and Commons pool

I recently made a mistake that went unnoticed for quite a long time. In an effort to improve the performance of JAXB (Java Architecture for XML Binding) operations, I cached instances of javax.xml.bind.Marshaller and javax.xml.bind.Unmarshaller. This article explains why that was not a good idea and describes how pooling with Apache Commons Pool can be used instead to improve overall JAXB performance.


JAXB

The JAXB API is fairly verbose. When working with XML, however, we generally do not want to create a SchemaFactory or JAXBContext ourselves. What we really need is one method to marshal an object into an XML string and another to unmarshal a string into an object. This goal is captured by the JaxbHelper interface. When the schemaLocation parameter is present, the XML is validated against the XSD schema; when it is null, no validation is performed.

/**
 * Custom interface, which simplifies JAXB API.
 */
public interface JaxbHelper {

    public <T> String marshal(T instance, @Nullable String schemaLocation) throws Exception;

    public <T> T unmarshal(String xml, Class<T> clazz, @Nullable String schemaLocation) throws Exception;
}

SimpleJaxbHelper

Let's start with the simplest possible implementation. It will later serve as the baseline for comparing performance. This and all following JaxbHelper implementations are thread-safe and re-entrant. The schemaLocation parameter is resolved relative to the class, so if the XSD schema is in the same package as the class, the schema file name is sufficient and the path can be omitted.

/**
 * Simplest possible implementation; uses neither caching nor pooling. It is thread-safe and re-entrant.
 */
public class SimpleJaxbHelper implements JaxbHelper {

    @Override
    public <T> String marshal(T instance, @Nullable String schemaLocation) throws JAXBException, SAXException {
        StringWriter result = new StringWriter();

        JAXBContext jaxbContext = JAXBContext.newInstance(instance.getClass());

        Marshaller marshaller = jaxbContext.createMarshaller();

        if (schemaLocation != null) {
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = schemaFactory.newSchema(instance.getClass().getResource(schemaLocation));
            marshaller.setSchema(schema);
        }

        marshaller.marshal(instance, result);

        return result.toString();
    }

    @Override
    public <T> T unmarshal(String xml, Class<T> clazz, @Nullable String schemaLocation) throws JAXBException, SAXException {
        JAXBContext jaxbContext = JAXBContext.newInstance(clazz);

        Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();

        if (schemaLocation != null) {
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = schemaFactory.newSchema(clazz.getResource(schemaLocation));
            unmarshaller.setSchema(schema);
        }

        //noinspection unchecked
        return (T) unmarshaller.unmarshal(new StringReader(xml));
    }
}
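
The validation step that setSchema enables can also be tried on its own with the javax.xml.validation API. The following is only a minimal sketch: the inline XSD, the class name SchemaCheck and its methods are assumptions made for illustration and are not part of the project described here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCheck {
    // Hypothetical inline schema: a single <name> element holding a string.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"name\" type=\"xs:string\"/>"
          + "</xs:schema>";

    /** Compiles the inline XSD into a Schema object. */
    static Schema loadSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    /** Returns true when the XML document conforms to the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            // A Validator is cheap to create; a new one is used for every document.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Schema schema = loadSchema();
        System.out.println(isValid(schema, "<name>Alice</name>"));
        System.out.println(isValid(schema, "<other/>"));
    }
}
```

Compiling the schema is the expensive part, which is why SimpleJaxbHelper pays a noticeable price for calling newSchema on every invocation.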

CachedJaxbHelper

The second implementation caches javax.xml.bind.JAXBContext instances, which are thread-safe, at least in the JAXB RI implementation. It also caches instances of javax.xml.validation.Schema, because these are immutable and there is no reason not to. Note that java.util.concurrent.ConcurrentHashMap is used because its get method generally does not block and may overlap with put.

/**
 * Implementation which holds its JAXBContext and Schema instances in a map. It is thread-safe and re-entrant.
 */
public class CachedJaxbHelper implements JaxbHelper {

    private static Map<Class, JAXBContext> jaxbContextMap = new ConcurrentHashMap<Class, JAXBContext>();
    private static Map<String, Schema> schemaMap = new ConcurrentHashMap<String, Schema>();

    private static JAXBContext getJaxbContext(Class clazz) throws JAXBException {
        JAXBContext jaxbContext = jaxbContextMap.get(clazz);

        if (jaxbContext == null) {
            jaxbContext = JAXBContext.newInstance(clazz);
            jaxbContextMap.put(clazz, jaxbContext);
        }
        return jaxbContext;
    }

    private static Schema getSchema(Class clazz, String schemaLocation) throws JAXBException, SAXException {
        Schema schema = schemaMap.get(schemaLocation);

        if (schema == null) {
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = schemaFactory.newSchema(clazz.getResource(schemaLocation));
            schemaMap.put(schemaLocation, schema);
        }
        return schema;
    }

    @Override
    public <T> String marshal(T instance, @Nullable String schemaLocation) throws JAXBException, SAXException {
        StringWriter result = new StringWriter();

        JAXBContext jaxbContext = getJaxbContext(instance.getClass());

        Marshaller marshaller = jaxbContext.createMarshaller();

        if (schemaLocation != null) {
            Schema schema = getSchema(instance.getClass(), schemaLocation);
            marshaller.setSchema(schema);
        }

        marshaller.marshal(instance, result);

        return result.toString();
    }

    @Override
    public <T> T unmarshal(String xml, Class<T> clazz, @Nullable String schemaLocation) throws JAXBException, SAXException {
        JAXBContext jaxbContext = getJaxbContext(clazz);

        Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();

        if (schemaLocation != null) {
            Schema schema = getSchema(clazz, schemaLocation);
            unmarshaller.setSchema(schema);
        }

        //noinspection unchecked
        return (T) unmarshaller.unmarshal(new StringReader(xml));
    }
}
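
The getJaxbContext and getSchema methods above use a check-then-put sequence, so two threads that miss the cache at the same moment may each build the same object. The result is still correct, only wasteful. Here is a minimal sketch, assuming Java 8 or later, of how ConcurrentHashMap.computeIfAbsent can create each value at most once. OnceCache and OnceCacheDemo are hypothetical names, and a counter stands in for an expensive call such as JAXBContext.newInstance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: caches one value per key and creates it at most once. */
class OnceCache<K, V> {
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    OnceCache(Function<K, V> factory) {
        this.factory = factory;
    }

    V get(K key) {
        // computeIfAbsent runs the factory atomically for a missing key,
        // so concurrent callers never create duplicate values.
        return map.computeIfAbsent(key, factory);
    }
}

class OnceCacheDemo {
    static final AtomicInteger CREATED = new AtomicInteger();

    // Stand-in for an expensive factory such as JAXBContext.newInstance(clazz).
    static final OnceCache<Class<?>, String> CACHE =
            new OnceCache<>(clazz -> {
                CREATED.incrementAndGet();
                return "context for " + clazz.getSimpleName();
            });

    public static void main(String[] args) {
        System.out.println(CACHE.get(String.class));
        System.out.println(CACHE.get(String.class));
        System.out.println("created: " + CREATED.get());
    }
}
```

There are two trade-offs. A real factory would have to wrap the checked JAXBException in an unchecked exception, and computeIfAbsent may block other updates to the map while a value is being computed, so whether this beats the occasional duplicate creation depends on the workload.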

PooledJaxbHelper

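The third implementation keeps Schema instances cached as before, but uses pooling for javax.xml.bind.JAXBContext, javax.xml.bind.Marshaller and javax.xml.bind.Unmarshaller. Marshaller and Unmarshaller instances are not thread-safe, which is why sharing cached instances between threads was a mistake; a pool instead hands each instance to one caller at a time. The first thing to notice in PooledJaxbHelper is PoolKey, a nested class that serves as the key for the marshaller and unmarshaller pools. It encapsulates the class and the schema location. The schema location may be null, so the equals and hashCode methods must handle that case.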
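
The null handling in such a key can be written more compactly with java.util.Objects (Java 7 and later). This is only a sketch of the idea; SchemaKey is a hypothetical stand-in for PoolKey, not the class used in the project.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical map key made of a class and an optional schema location. */
final class SchemaKey {
    private final Class<?> clazz;
    private final String schemaLocation; // may be null, meaning "no validation"

    SchemaKey(Class<?> clazz, String schemaLocation) {
        this.clazz = Objects.requireNonNull(clazz, "clazz");
        this.schemaLocation = schemaLocation;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SchemaKey)) return false;
        SchemaKey other = (SchemaKey) o;
        // Objects.equals treats two nulls as equal and never throws.
        return clazz.equals(other.clazz)
                && Objects.equals(schemaLocation, other.schemaLocation);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates null elements.
        return Objects.hash(clazz, schemaLocation);
    }
}

class SchemaKeyDemo {
    public static void main(String[] args) {
        Map<SchemaKey, String> pools = new HashMap<>();
        pools.put(new SchemaKey(String.class, null), "no validation");
        pools.put(new SchemaKey(String.class, "sample.xsd"), "validating");
        System.out.println(pools.get(new SchemaKey(String.class, null)));
        System.out.println(pools.size());
    }
}
```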

Commons Pool is very easy to work with. It provides org.apache.commons.pool.impl.GenericKeyedObjectPool, which holds instances grouped by key. The pool must be given a factory that creates the pooled objects: JaxbContextFactory creates new JAXBContext instances, while MarshallerFactory and UnmarshallerFactory create Marshaller and Unmarshaller instances. MarshallerFactory and UnmarshallerFactory themselves borrow a JAXBContext from jaxbContextPool. Every object borrowed from a pool with borrowObject must be returned with returnObject. A pool may optionally be given a GenericKeyedObjectPool.Config to change the default configuration. When an exception occurs, I invalidate the Marshaller or Unmarshaller instance with invalidateObject so that it is never used again.

/**
 * Implementation which holds its Schema instances in a map, and uses pooling for JAXBContext, Marshaller and Unmarshaller instances.
 * It is thread-safe and re-entrant.
 */
public class PooledJaxbHelper implements JaxbHelper {

    private static class PoolKey {
        private Class clazz;
        private String schemaLocation;

        private PoolKey(Class clazz, @Nullable String schemaLocation) {
            this.clazz = clazz;
            this.schemaLocation = schemaLocation;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;

            PoolKey poolKey = (PoolKey) o;

            return clazz.equals(poolKey.clazz) && !(schemaLocation != null ? !schemaLocation.equals(poolKey.schemaLocation) : poolKey.schemaLocation != null);

        }

        @Override
        public int hashCode() {
            int result = clazz.hashCode();
            result = 31 * result + (schemaLocation != null ? schemaLocation.hashCode() : 0);
            return result;
        }

        public Class getClazz() {
            return clazz;
        }

        @Nullable
        public String getSchemaLocation() {
            return schemaLocation;
        }
    }

    private static class MarshallerFactory extends BaseKeyedPoolableObjectFactory<PoolKey, Marshaller> {
        @Override
        public Marshaller makeObject(PoolKey key) throws Exception {
            JAXBContext jaxbContext = jaxbContextPool.borrowObject(key.getClazz());

            Marshaller marshaller = jaxbContext.createMarshaller();

            if (key.getSchemaLocation() != null) {
                Schema schema = getSchema(key);
                marshaller.setSchema(schema);
            }

            jaxbContextPool.returnObject(key.getClazz(), jaxbContext);

            return marshaller;
        }
    }

    private static class UnmarshallerFactory extends BaseKeyedPoolableObjectFactory<PoolKey, Unmarshaller> {
        @Override
        public Unmarshaller makeObject(PoolKey key) throws Exception {
            JAXBContext jaxbContext = jaxbContextPool.borrowObject(key.getClazz());

            Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();

            if (key.getSchemaLocation() != null) {
                Schema schema = getSchema(key);
                unmarshaller.setSchema(schema);
            }

            jaxbContextPool.returnObject(key.getClazz(), jaxbContext);

            return unmarshaller;
        }
    }

    private static class JaxbContextFactory extends BaseKeyedPoolableObjectFactory<Class, JAXBContext> {
        @Override
        public JAXBContext makeObject(Class clazz) throws Exception {
            return JAXBContext.newInstance(clazz);
        }
    }

    private static class CustomPoolConfig extends GenericKeyedObjectPool.Config {
        {
            maxIdle = 3;
            maxActive = 10;
            maxTotal = 100;
            minIdle = 1;
            whenExhaustedAction = GenericKeyedObjectPool.WHEN_EXHAUSTED_GROW;
            timeBetweenEvictionRunsMillis = 1000L * 60L * 10L;
            numTestsPerEvictionRun = 50;
            minEvictableIdleTimeMillis = 1000L * 60L * 5L; // 5 min.
        }
    }

    private static Schema getSchema(PoolKey poolKey) throws JAXBException, SAXException {
        Schema schema = schemaMap.get(poolKey.getSchemaLocation());

        if (schema == null) {
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = schemaFactory.newSchema(poolKey.getClazz().getResource(poolKey.getSchemaLocation()));
            schemaMap.put(poolKey.getSchemaLocation(), schema);
        }
        return schema;
    }

    private static Map<String, Schema> schemaMap = new ConcurrentHashMap<String, Schema>();
    private static GenericKeyedObjectPool<Class, JAXBContext> jaxbContextPool = new GenericKeyedObjectPool<Class, JAXBContext>(new JaxbContextFactory(), new CustomPoolConfig());
    private static GenericKeyedObjectPool<PoolKey, Marshaller> marshallerPool = new GenericKeyedObjectPool<PoolKey, Marshaller>(new MarshallerFactory(), new CustomPoolConfig());
    private static GenericKeyedObjectPool<PoolKey, Unmarshaller> unmarshallerPool = new GenericKeyedObjectPool<PoolKey, Unmarshaller>(new UnmarshallerFactory(), new CustomPoolConfig());

    @Override
    public <T> String marshal(T instance, @Nullable String schemaLocation) throws Exception {
        StringWriter result = new StringWriter();

        PoolKey poolKey = new PoolKey(instance.getClass(), schemaLocation);
        Marshaller marshaller = marshallerPool.borrowObject(poolKey);

        try {
            marshaller.marshal(instance, result);

            marshallerPool.returnObject(poolKey, marshaller);

            return result.toString();
        } catch (Exception e) {
            marshallerPool.invalidateObject(poolKey, marshaller);
            throw new RuntimeException(e);
        }
    }

    @Override
    public <T> T unmarshal(String xml, Class<T> clazz, @Nullable String schemaLocation) throws Exception {
        T result;

        PoolKey poolKey = new PoolKey(clazz, schemaLocation);
        Unmarshaller unmarshaller = unmarshallerPool.borrowObject(poolKey);

        try {
            //noinspection unchecked
            result = (T) unmarshaller.unmarshal(new StringReader(xml));

            unmarshallerPool.returnObject(poolKey, unmarshaller);

            return result;
        } catch (Exception e) {
            unmarshallerPool.invalidateObject(poolKey, unmarshaller);
            throw new RuntimeException(e);
        }
    }
}
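
The borrow, use, then return-or-discard discipline is the core of the pooled approach. To keep the example self-contained, the sketch below illustrates it with JDK classes only. It is not Commons Pool: TinyKeyedPool is a hypothetical, unbounded pool without limits or eviction, and a StringBuilder stands in for a Marshaller.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical, unbounded keyed pool; illustrates the idea only. */
class TinyKeyedPool<K, V> {
    private final ConcurrentMap<K, Queue<V>> idle = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    TinyKeyedPool(Function<K, V> factory) {
        this.factory = factory;
    }

    /** Hands out an idle instance for the key, or creates a new one. */
    V borrow(K key) {
        Queue<V> queue = idle.get(key);
        V value = (queue == null) ? null : queue.poll();
        return (value != null) ? value : factory.apply(key);
    }

    /** Makes the instance available to the next borrower. */
    void giveBack(K key, V value) {
        idle.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(value);
    }

    int idleCount(K key) {
        Queue<V> queue = idle.get(key);
        return (queue == null) ? 0 : queue.size();
    }
}

class TinyKeyedPoolDemo {
    static final TinyKeyedPool<String, StringBuilder> POOL =
            new TinyKeyedPool<>(key -> new StringBuilder());

    /** Borrows a StringBuilder, uses it, and returns it only on success. */
    static String repeat(String text, int times) {
        StringBuilder sb = POOL.borrow("sb");
        boolean ok = false;
        try {
            if (times < 0) throw new IllegalArgumentException("times < 0");
            sb.setLength(0); // reset state left by the previous borrower
            for (int i = 0; i < times; i++) sb.append(text);
            String result = sb.toString();
            ok = true;
            return result;
        } finally {
            // A failed instance is simply dropped, which plays the role
            // of invalidateObject in the code above.
            if (ok) POOL.giveBack("sb", sb);
        }
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3));
        System.out.println(POOL.idleCount("sb"));
    }
}
```

Because each borrowed instance is held by exactly one caller until it is given back, objects that are not thread-safe can be reused without being shared.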

Conclusion

As for performance, many factors must be taken into consideration, for example the complexity of the XML documents, the pool configuration, the number of processors, the amount of memory and so on. Absolute numbers therefore have little meaning here, but the following relative results seem consistent enough.

INFO  JaxbHelperTest - testCompareAllToSimple
INFO  JaxbHelperTest - SimpleJaxbHelper / CachedJaxbHelper ratio: 5.813814804912555
INFO  JaxbHelperTest - SimpleJaxbHelper / PooledJaxbHelper ratio: 14.213429365043625

INFO  JaxbHelperTest - testCompareAllToSimpleMultipleThreads
INFO  JaxbHelperTest - SimpleJaxbHelper / CachedJaxbHelper ratio: 7.6223063332965735
INFO  JaxbHelperTest - SimpleJaxbHelper / PooledJaxbHelper ratio: 9.20335326080807

In other words, CachedJaxbHelper is roughly 6 to 8 times faster than SimpleJaxbHelper, and PooledJaxbHelper is roughly 9 to 14 times faster than SimpleJaxbHelper.


Appendix

Project structure
JaxbPool/
|-- pom.xml
|-- src
    |-- main
    |   |-- java
    |   |   `-- eu
    |   |       `-- zont
    |   |           `-- jaxbpool
    |   |               |-- core
    |   |               |   |-- CachedJaxbHelper.java
    |   |               |   |-- JaxbHelper.java
    |   |               |   |-- PooledJaxbHelper.java
    |   |               |   `-- SimpleJaxbHelper.java
    |   |               `-- xml
    |   |                   |-- ObjectFactory.java
    |   |                   |-- PersonType.java
    |   |                   `-- SampleType.java
    |   `-- resources
    |       |-- eu
    |       |   `-- zont
    |       |       `-- jaxbpool
    |       |           `-- xml
    |       |               `-- sample.xsd
    |       |-- log4j.dtd
    |       `-- log4j.xml
    `-- test
        |-- java
        |   `-- eu
        |       `-- zont
        |           `-- jaxbpool
        |               `-- core
        |                   `-- JaxbHelperTest.java
        `-- resources
            `-- eu
                `-- zont
                    `-- jaxbpool
                        `-- xml
                            `-- sample.xml
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>eu.zont.jaxbpool</groupId>
    <artifactId>jaxb-pool</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>commons-pool</groupId>
            <artifactId>commons-pool</artifactId>
            <version>1.6</version>
        </dependency>
        <dependency>
            <groupId>org.kohsuke.jetbrains</groupId>
            <artifactId>annotations</artifactId>
            <version>9.0</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.16</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.10</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
Tests
public class JaxbHelperTest {

    private static final Logger log = Logger.getLogger(JaxbHelperTest.class);

    private static final String SAMPLE_SCHEMA_LOCATION = "sample.xsd";
    private static final int TO_MILLISECONDS = 1000000;
    private static final int HEAVY_LOAD = 1000;
    private static final int NUM_THREADS = 10;


    private SampleType createSample() {
        ObjectFactory objectFactory = new ObjectFactory();

        SampleType sample = objectFactory.createSampleType();

        PersonType person = objectFactory.createPersonType();
        person.setFirstname("firstname");
        person.setSurname("surname");

        sample.getPerson().add(person);

        return sample;
    }

    private void testJaxbHelper(JaxbHelper jaxbHelper) throws Exception {
        SampleType sample = createSample();

        String sampleXml = jaxbHelper.marshal(sample, SAMPLE_SCHEMA_LOCATION);
        SampleType sampleCopy = jaxbHelper.unmarshal(sampleXml, SampleType.class, SAMPLE_SCHEMA_LOCATION);

        assertNotNull(sampleXml);
        assertNotNull(sampleCopy);
        assertEquals(sample.getPerson().size(), sampleCopy.getPerson().size());
        assertEquals(sample.getPerson().get(0).getFirstname(), sampleCopy.getPerson().get(0).getFirstname());
        assertEquals(sample.getPerson().get(0).getSurname(), sampleCopy.getPerson().get(0).getSurname());
    }

    @Test
    public void testSimpleJaxbHelper() throws Exception {
        testJaxbHelper(new SimpleJaxbHelper());
    }

    @Test
    public void testCachedJaxbHelper() throws Exception {
        testJaxbHelper(new CachedJaxbHelper());
    }

    @Test
    public void testPooledJaxbHelper() throws Exception {
        testJaxbHelper(new PooledJaxbHelper());
    }

    private long testJaxbHelperLoad(JaxbHelper jaxbHelper, int load) throws Exception {
        SampleType sample = createSample();

        long startTime = System.nanoTime();

        for (int i = 0; i < load; i++) {
            String sampleXml = jaxbHelper.marshal(sample, SAMPLE_SCHEMA_LOCATION);
            jaxbHelper.unmarshal(sampleXml, SampleType.class, SAMPLE_SCHEMA_LOCATION);
        }

        return System.nanoTime() - startTime;
    }

    @Test
    public void testSimpleJaxbHelperHeavyLoad() throws Exception {
        long estimatedTime = testJaxbHelperLoad(new SimpleJaxbHelper(), HEAVY_LOAD);
        log.info("SimpleJaxbHelper estimated time: " + estimatedTime / TO_MILLISECONDS);
    }

    @Test
    public void testCachedJaxbHelperHeavyLoad() throws Exception {
        long estimatedTime = testJaxbHelperLoad(new CachedJaxbHelper(), HEAVY_LOAD);
        log.info("CachedJaxbHelper estimated time: " + estimatedTime / TO_MILLISECONDS);
    }

    @Test
    public void testPooledJaxbHelperHeavyLoad() throws Exception {
        long estimatedTime = testJaxbHelperLoad(new PooledJaxbHelper(), HEAVY_LOAD);
        log.info("PooledJaxbHelper estimated time: " + estimatedTime / TO_MILLISECONDS);
    }

    @Test
    public void testCompareAllToSimple() throws Exception {
        long simpleEstimatedTime = testJaxbHelperLoad(new SimpleJaxbHelper(), HEAVY_LOAD);
        long cachedEstimatedTime = testJaxbHelperLoad(new CachedJaxbHelper(), HEAVY_LOAD);
        long pooledEstimatedTime = testJaxbHelperLoad(new PooledJaxbHelper(), HEAVY_LOAD);

        log.info("testCompareAllToSimple");
        log.info("SimpleJaxbHelper / CachedJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) cachedEstimatedTime);
        log.info("SimpleJaxbHelper / PooledJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) pooledEstimatedTime);
    }


    private class JaxbTask implements Callable<Long> {
        private JaxbHelper jaxbHelper;
        private int load;

        public JaxbTask(JaxbHelper jaxbHelper, int load) {
            this.jaxbHelper = jaxbHelper;
            this.load = load;
        }

        public Long call() throws Exception {
            return testJaxbHelperLoad(jaxbHelper, load);
        }
    }


    private long testJaxbHelperHeavyLoadMultipleThreads(JaxbHelper jaxbHelper, int numThreads, int load) throws Exception {
        long estimatedTime = 0;

        ExecutorService threadExecutor = Executors.newFixedThreadPool(numThreads);

        List<JaxbTask> taskList = new ArrayList<JaxbTask>(numThreads);

        for (int i = 0; i < numThreads; i++) {
            taskList.add(new JaxbTask(jaxbHelper, load / numThreads));
        }

        List<Future<Long>> results = threadExecutor.invokeAll(taskList);

        threadExecutor.shutdown();

        boolean finished = threadExecutor.awaitTermination(1, TimeUnit.MINUTES);

        if (finished) {
            for (Future<Long> result : results) {
                estimatedTime += result.get();
            }
        } else {
            fail("Some of the test threads failed to finish correctly.");
        }

        return estimatedTime;
    }

    @Test
    public void testCompareSimpleAndPooledMultipleThreads() throws Exception {
        long simpleEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new SimpleJaxbHelper(), NUM_THREADS, HEAVY_LOAD);
        long pooledEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new PooledJaxbHelper(), NUM_THREADS, HEAVY_LOAD);

        log.info("SimpleJaxbHelper / PooledJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) pooledEstimatedTime);
    }

    @Test
    public void testCompareSimpleAndCachedMultipleThreads() throws Exception {
        long simpleEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new SimpleJaxbHelper(), NUM_THREADS, HEAVY_LOAD);
        long cachedEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new CachedJaxbHelper(), NUM_THREADS, HEAVY_LOAD);

        log.info("SimpleJaxbHelper / CachedJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) cachedEstimatedTime);
    }

    @Test
    public void testCompareAllToSimpleMultipleThreads() throws Exception {
        long simpleEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new SimpleJaxbHelper(), NUM_THREADS, HEAVY_LOAD);
        long cachedEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new CachedJaxbHelper(), NUM_THREADS, HEAVY_LOAD);
        long pooledEstimatedTime = testJaxbHelperHeavyLoadMultipleThreads(new PooledJaxbHelper(), NUM_THREADS, HEAVY_LOAD);

        log.info("testCompareAllToSimpleMultipleThreads");
        log.info("SimpleJaxbHelper / CachedJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) cachedEstimatedTime);
        log.info("SimpleJaxbHelper / PooledJaxbHelper ratio: " + (double) simpleEstimatedTime / (double) pooledEstimatedTime);
    }
}

Sunday, 4 September 2011

Pairing algorithms

I was prompted to write this post by the need to rewrite a piece of code so that it would run faster. Since I have run into a similar problem several times, I consider it fairly common, but I do not want to describe it in the abstract, so I made up an example with invoices and payments.

A general description of the problem would go roughly like this: we have two sets of objects and need to assign an object from the second set to an object from the first set, based on certain criteria and properties of those objects.

In other words: we have a set of invoices and a set of payments, and we assume that the data came from external systems, so we cannot influence its structure or order.


An invoice (Faktura) contains an invoice number and a monetary amount, and possibly other properties that do not interest us.

public class Faktura {

  private String cisloFaktury;
  private BigDecimal suma;

  // ...constructor, getters... possibly other properties that we do not work with...
}


A payment (Platba) contains a variable symbol (variabilný symbol, a payment reference number) and an amount, and possibly other properties that do not interest us.

public class Platba {

  private String variabilnySymbol;
  private BigDecimal suma;

  // ...constructor, getters... possibly other properties that we do not work with...
}


If the invoice number equals the payment's variable symbol and the amounts are equal as well, the invoice and the payment form a pair. We assume that one invoice is paid by exactly one payment and vice versa. We also assume that there may be overdue invoices that have not been paid and have no payments, and likewise payments that do not relate to any invoice.

The fact that an invoice and a payment form a pair is expressed by the class Par.

public class Par {

  private Faktura faktura;
  private Platba platba;

  // ...constructor, getters...
}


The solution we are looking for takes a set of invoices and a set of payments as input and produces a set of pairs as output.

public interface ParovanieRiesenie {

  /**
   * Pairs invoices with payments according to certain criteria.
   *
   * @param faktury set of invoices; not all invoices have to be paid, i.e. some may be impossible to pair
   * @param platby set of payments; not every payment has to have an invoice, i.e. some may be impossible to pair
   * @return set of pairs; a pair consists of an invoice and a payment, assuming that one invoice can have only one payment and vice versa
   */
  public Set<Par> parovanie(Set<Faktura> faktury, Set<Platba> platby);
}


Solution 1.

The simplest solution, and probably the first one to occur to anyone, is a loop within a loop. The outer loop iterates over all invoices and the inner loop over the payments. When we find a match, we create a pair and break out of the inner loop (on the assumption that one invoice can have only one payment and vice versa).

public class RiesenieCyklus implements ParovanieRiesenie {

  public Set<Par> parovanie(Set<Faktura> faktury, Set<Platba> platby) {

      Set<Par> vysledok = new HashSet<Par>();

      for (Faktura faktura : faktury) {
          for (Platba platba : platby) {
              if (porovnanie(faktura, platba)) {
                  vysledok.add(new Par(faktura, platba));
                  break; // we assume that one invoice can have only one payment
              }
          }
      }

      return vysledok;
  }

  /**
   * Compares an invoice and a payment according to certain criteria.
   *
   * @param faktura invoice to compare
   * @param platba  payment to compare
   * @return true if the invoice number equals the payment's variable symbol and the amounts match, otherwise false.
   */
  private boolean porovnanie(Faktura faktura, Platba platba) {
      return faktura.getCisloFaktury().equals(platba.getVariabilnySymbol())
              && faktura.getSuma().equals(platba.getSuma());
  }
}


The obvious advantage of this solution is its simple implementation: anyone who looks at it immediately sees what is going on. Performance is a different matter. Assuming we have N invoices and M payments and not a single payment can be paired with an invoice, the body of the inner loop executes M*N times. On top of that, the porovnanie method is likely to be more complex in practice.


Solution 2.

The alternative solution I would like to describe is based on creating a common key for both invoices and payments. Using this key, the invoices are put into a HashMap in one loop. In a second loop, which is not nested, the map is searched using keys created from the payments. I chose to implement the key so that it contains a String encoding the information from the payment or the invoice, with the equals and hashCode methods generated over this String.

public class RiesenieMapa implements ParovanieRiesenie {


  public Set<Par> parovanie(Set<Faktura> faktury, Set<Platba> platby) {

      Set<Par> vysledok = new HashSet<Par>();

      Map<Kluc, Faktura> mapaFaktury = new HashMap<Kluc, Faktura>();

      for (Faktura faktura : faktury) {
          mapaFaktury.put(new Kluc(faktura), faktura);
      }

      for (Platba platba : platby) {
          Faktura faktura = mapaFaktury.get(new Kluc(platba));

          if (faktura != null) {
              vysledok.add(new Par(faktura, platba));
          }
      }

      return vysledok;
  }

  private class Kluc {
      private static final String SEPARATOR = "_";

      private String kluc;

      /**
       * Creates a key represented by a string of the form "invoice number" + "separator" + "invoice amount".
       * <p/>
       * e.g.: "000001_1000"
       *
       * @param faktura invoice from which the key is created
       */
      private Kluc(Faktura faktura) {
          StringBuilder sb = new StringBuilder();

          sb.append(faktura.getCisloFaktury());
          sb.append(SEPARATOR);
          sb.append(faktura.getSuma().toPlainString());

          this.kluc = sb.toString();
      }

      /**
       * Creates a key represented by a string of the form "variable symbol" + "separator" + "payment amount".
       * <p/>
       * e.g.: "000001_1000"
       *
       * @param platba payment from which the key is created
       */
      private Kluc(Platba platba) {
          StringBuilder sb = new StringBuilder();

          sb.append(platba.getVariabilnySymbol());
          sb.append(SEPARATOR);
          sb.append(platba.getSuma().toPlainString());

          this.kluc = sb.toString();
      }

      /**
       * Generated equals method, which takes the kluc property into account.
       */
      @Override
      public boolean equals(Object o) {
          if (this == o) return true;
          if (o == null || getClass() != o.getClass()) return false;

          Kluc that = (Kluc) o;

          return kluc.equals(that.kluc);
      }

      /**
       * Generated hashCode method, which takes the kluc property into account.
       */
      @Override
      public int hashCode() {
          return kluc.hashCode();
      }
  }
}


The advantage of this solution is that the key-building code runs only M+N times. Of course, it is not always possible to encode all conditions directly into a String. I also consider it an advantage that the performance of the algorithm is not affected by the input data to such a large extent, and is therefore much easier to predict.
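
One detail deserves attention: the map lookup uses get, so if the input contains two payments with the same variable symbol and amount, both would be paired with the same invoice. The sketch below uses remove instead, so that each invoice is consumed at most once. It assumes simplified stand-in classes (Invoice, Payment, OneToOneMatcher are hypothetical names) in place of Faktura, Platba and Par.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for the Faktura class. */
class Invoice {
    final String number;
    final BigDecimal amount;

    Invoice(String number, BigDecimal amount) {
        this.number = number;
        this.amount = amount;
    }
}

/** Simplified stand-in for the Platba class. */
class Payment {
    final String variableSymbol;
    final BigDecimal amount;

    Payment(String variableSymbol, BigDecimal amount) {
        this.variableSymbol = variableSymbol;
        this.amount = amount;
    }
}

class OneToOneMatcher {
    /** Builds the same "id_amount" style key as the Kluc class. */
    static String key(String id, BigDecimal amount) {
        return id + "_" + amount.toPlainString();
    }

    /** Returns {invoice number, variable symbol} pairs; each invoice is used at most once. */
    static List<String[]> match(List<Invoice> invoices, List<Payment> payments) {
        Map<String, Invoice> byKey = new HashMap<>();
        for (Invoice invoice : invoices) {
            byKey.put(key(invoice.number, invoice.amount), invoice);
        }

        List<String[]> pairs = new ArrayList<>();
        for (Payment payment : payments) {
            // remove() returns the invoice and deletes it from the map,
            // so a second payment with the same key finds nothing.
            Invoice invoice = byKey.remove(key(payment.variableSymbol, payment.amount));
            if (invoice != null) {
                pairs.add(new String[] {invoice.number, payment.variableSymbol});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Invoice> invoices = Arrays.asList(new Invoice("000001", new BigDecimal("1000")));
        List<Payment> payments = Arrays.asList(
                new Payment("000001", new BigDecimal("1000")),
                new Payment("000001", new BigDecimal("1000")));
        System.out.println(match(invoices, payments).size());
    }
}
```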

It might also be worth thinking about how this task could be split up efficiently so that it can run in parallel, but I will leave that to the reader.
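
As one possible starting point, here is a rough sketch that assumes Java 8 streams and inputs already reduced to key strings of the form "id_amount". ParallelMatcher is a hypothetical name, and whether the parallel version is actually faster depends on the size of the data and the cost of building the keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelMatcher {
    /**
     * Takes precomputed keys and returns the keys present on both sides.
     * Each invoice key is matched at most once.
     */
    static List<String> match(List<String> invoiceKeys, List<String> paymentKeys) {
        // Phase 1: build a concurrent lookup map in parallel.
        // Duplicate invoice keys are collapsed by keeping the first value.
        ConcurrentMap<String, String> byKey = invoiceKeys.parallelStream()
                .collect(Collectors.toConcurrentMap(
                        Function.identity(), Function.identity(), (a, b) -> a));

        // Phase 2: probe the finished map in parallel. remove() on a
        // concurrent map is atomic, so only one payment can claim a key.
        Map<String, String> lookup = byKey;
        return paymentKeys.parallelStream()
                .map(key -> lookup.remove(key))
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> invoices = Arrays.asList("000001_1000", "000002_50", "000003_75");
        List<String> payments = Arrays.asList("000002_50", "999999_1", "000001_1000", "000001_1000");
        System.out.println(match(invoices, payments));
    }
}
```

The two phases must not overlap: the map has to be complete before any payment is looked up, which is why they are written as two separate stream pipelines.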

Friday, 27 May 2011

To lighten the mood :)

An amusing attempt to prove that the Matrix was programmed in Java, inspired by Disgruntled Bomb.
package there.is.no.spoon;

import java.lang.reflect.Field;

public class Matrix {

    static {
        dejaVu("value", "spoon", "wake up".toCharArray());
        dejaVu("count", "spoon", "wake up".length());
    }

    private static void dejaVu(String fieldName, String ref, Object value) {
        try {
            Field field = fieldName.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(ref, value);
        } catch (Exception e) {
            // There is no spoon, wake up. :)
        }
    }

    public static void main(String... args) {

        System.out.println("spoon");
    }
}
Inspired by: http://thedailywtf.com/Articles/Disgruntled-Bomb-Java-Edition.aspx

Friday, 25 September 2009

How to deal with password hash in JBoss Seam

Recently I tried to clarify some points concerning the authentication part of Seam security. The forum thread where I gave some advice on this subject, and which inspired this blog entry, can be found here: How to persist user?

So what is it all about? Seam provides a standard API for managing a Seam application's users and roles, called Identity Management. To use Seam's Identity Management, it must be configured with an Identity Store. Out of the box, Seam provides the Jpa Identity Store, which stores users and roles in a database.

Apparently the developer asking for help had discovered the identityManager component, which can be used to create new users. However, to use this component the user must already be authenticated and must additionally possess certain permissions. (Permissions and authorization in general are not covered in this article.)

But this creates a Catch-22 when the database is empty and there is no way to log in. In short, the developer cannot use the identityManager.createUser(String name, String password) method because he cannot log in, and he cannot log in because there are no user accounts in the database to log in with.

Well no problem, developers can populate their brand new database with SQL scripts manually or in case of Seam, import-dev.sql file can be employed to do the job. But still there is one catch, Seam can automatically hash passwords in database and it is not trivial to reproduce this manually.

This article explains first shortly how to set up Jpa Identity Store using hashed password with salt and then how gather all information needed to create SQL script, which inserts first user into database.

Firstly Jpa Identity Store must be configured in components.xml descriptor like so:
<security:identity-manager identity-store="#{jpaIdentityStore}"/>
<security:jpa-identity-store 
user-class="com.acme.model.User"
role-class="com.acme.model.Role"/>
where the User class is an entity annotated with Seam annotations like so:
(only getters are shown; fields and setters are omitted for clarity)
@Entity
public class User implements Serializable {
...
 @NotNull
 @UserPrincipal
 @Column(unique = true)
 public String getUsername() { return username; }

 @UserPassword
 public String getPasswordHash() { return passwordHash; }

 @PasswordSalt
 public String getSalt() { return salt; }

 @UserEnabled
 public boolean isEnabled() { return enabled; }

 @UserFirstName
 public String getFirstname() { return firstname; }

 @UserLastName
 public String getLastname() { return lastname; }

 @UserRoles
 @ManyToMany
 public List<UserRole> getRoles() { return roles; }
...
}
The most notable annotations are @UserPrincipal, @UserPassword and @PasswordSalt.
The field annotated with @UserPrincipal stores the username.
The field annotated with @PasswordSalt stores the salt.
The field annotated with @UserPassword stores the hashed password.
The password hash depends on these parameters:

passwordHash = f(password, algorithm, salt, iterations)

Where:

password - the plain-text password
algorithm - the hashing algorithm; in Seam 2.2.0.GA, HmacSHA1 is the default
salt - the salt value used to produce the password hash; when not specified, the field annotated with @UserPrincipal is used (that is, the username serves as the salt)
iterations - the number of iterations used to generate the password hash; the default value is 1000
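
To make the formula above concrete, here is a minimal sketch of a salted, iterated hash built only on the standard JDK crypto API. It assumes that "HmacSHA1 with iterations" corresponds to the JDK algorithm PBKDF2WithHmacSHA1 and that a 256-bit key is derived; both are assumptions of this sketch, and neither the key length nor the hex encoding is guaranteed to match what Seam stores. The class name PasswordHashSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordHashSketch {

    // Assumption of this sketch: a 256-bit derived key, hex encoded.
    private static final int KEY_LENGTH_BITS = 256;

    /** passwordHash = f(password, algorithm, salt, iterations), with the algorithm fixed. */
    public static String hash(char[] password, byte[] salt, int iterations)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_LENGTH_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            return toHex(factory.generateSecret(spec).getEncoded());
        } finally {
            // Wipe the internal copy of the password held by the key spec.
            spec.clearPassword();
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // When no salt is configured, the username would play the role of the salt.
        byte[] salt = "admin".getBytes(StandardCharsets.UTF_8);
        System.out.println(hash("admin".toCharArray(), salt, 1000));
    }
}
```

The same password, salt and iteration count always produce the same hash, while changing any one of them changes the result. That is why all of these values have to be known before a matching row can be written by hand.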

Solution:

I have created a small test that does the trick. It uses the Seam PasswordHash component and produces a valid password, salt and hash combination that can be used in an SQL script. This code was tested only with Seam 2.2.0.GA. (It uses the default hashing algorithm. An algorithm parameter could be introduced, but it depends on the security provider implementation, so it is out of scope for this article.)
public class PasswordHashTest extends SeamTest {

    @Test
    public void testPasswordHash() throws Exception {
        new ComponentTest() {
            @Override
            protected void testComponents() throws Exception {
                Log log = Logging.getLog(PasswordHashTest.class);
                PasswordHash passwordHash = PasswordHash.instance();

                final byte[] salt = passwordHash.generateRandomSalt();
                final int iterations = 1000;
                final String password = "admin";
                final String hash = passwordHash.createPasswordKey(password.toCharArray(), salt, iterations);

                assert hash != null : "Hash not calculated!";

                log.info("Password: " + password);
                log.info("Salt: " + BinTools.bin2hex(salt));
                log.info("Iterations: " + iterations);
                log.info("Hash: " + hash);
            }
        }.run();
    }
}
After running this test, all the information needed is logged at info level. The test can be run multiple times if more user accounts are needed; each run generates a new random salt.
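
The logged values still have to be turned into an SQL statement for import-dev.sql. The following is only a sketch of that last step: the table name app_user and the column names are hypothetical and must be replaced with whatever the User entity actually maps to. The class name FirstUserSql is hypothetical as well.

```java
public class FirstUserSql {

    /** Doubles single quotes so the value can sit inside an SQL string literal. */
    public static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds one INSERT line; table and column names are placeholders. */
    public static String insertStatement(String username, String hash, String saltHex) {
        return "INSERT INTO app_user (username, passwordhash, salt, enabled) VALUES ("
                + quote(username) + ", " + quote(hash) + ", " + quote(saltHex) + ", true);";
    }

    public static void main(String[] args) {
        // Placeholder arguments: paste the hash and salt logged by the test instead.
        System.out.println(insertStatement("admin", "HASH_FROM_LOG", "SALT_FROM_LOG"));
    }
}
```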

Wednesday, 2 September 2009

Simple manifest generation with ANT


A look at the battlefield
On one side there is a J2EE application packaged as an EAR, generated by the seam-gen script, which contains an EJB module (JAR) and the web part of the application (WAR).
Simplified application structure:
  • myapp.ear
    • myapp-ejb.jar
    • lib
      • richfaces-api.jar
      • ...
    • myapp-web.war
      • WEB-INF
        • lib
          • richfaces-ui.jar
          • richfaces-impl.jar
          • ...
  • myapp.ear/lib - a directory containing libraries that are implicitly visible only from myapp-ejb.jar
  • myapp.ear/myapp-web.war/WEB-INF/lib - libraries in this directory are, conversely, visible only to myapp-web.war
On the other hand, it is quite common to need some libraries in both places. How can this be solved? Copying the library into both lib directories does not solve the problem, because each copy is then loaded by a different class loader. It may happen, for example, that a class from the library is the return value of a session bean method used in a servlet. In that case the web container would complain that the class returned by the session bean to the servlet is not the one it actually expected.

Solution
One solution is to copy the libraries from the myapp.ear/myapp-web.war/WEB-INF/lib directory to myapp.ear/lib, create a manifest file with a classpath, and place it at myapp.ear/myapp-web.war/META-INF/MANIFEST.MF.
Modified application structure:
  • myapp.ear
    • myapp-ejb.jar
    • lib
      • richfaces-api.jar
      • richfaces-ui.jar
      • richfaces-impl.jar
      • ...
    • myapp-web.war
      • META-INF
        • MANIFEST.MF
The content of the MANIFEST.MF file would look like this:

Class-Path: lib/richfaces-impl.jar lib/richfaces-ui.jar

Note that the path in the manifest is relative to the root of the EAR file. As a result, all libraries are visible to the EJB module, and the libraries richfaces-impl.jar and richfaces-ui.jar are visible to the web module. This manifest file can of course be written by hand, but the classpath usually has many more entries, and since the manifest specification limits the line length, a hand-written manifest can easily end up invalid.
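
The line-length limit is easy to see with the JDK's own java.util.jar.Manifest class, which wraps long values into continuation lines when writing and joins them again when reading. The sketch below is only an illustration of that behaviour; the class name ManifestWrapDemo and the library names in main are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestWrapDemo {

    /** Serializes a manifest with the given Class-Path value and returns the file text. */
    public static String write(String classPath) throws IOException {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.CLASS_PATH, classPath);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Parses manifest text and returns the Class-Path value with continuation lines joined. */
    public static String readClassPath(String manifestText) throws IOException {
        Manifest manifest = new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        return manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
    }

    public static void main(String[] args) throws IOException {
        StringBuilder classPath = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            // Hypothetical library names, just long enough to force wrapping.
            classPath.append("lib/hypothetical-library-").append(i).append(".jar ");
        }
        System.out.print(write(classPath.toString().trim()));
    }
}
```

In the printed output a long Class-Path value is split across several lines, each continuation line starting with a single space. Getting that splitting right by hand is exactly the error-prone part.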

Generating MANIFEST.MF with a classpath using ANT
A project generated by the seam-gen script uses the following piece of code to copy libraries into the WAR file.
<copy todir="${war.dir}/WEB-INF/lib">
  <fileset dir="${lib.dir}">
      <includesfile name="deployed-jars-war.list"/>
      <exclude name="jboss-seam-gen.jar"/>
      <exclude name="jboss-seam-debug.jar" unless="is.debug"/>
  </fileset>
</copy>
  • deployed-jars-war.list is a text file listing the libraries that are copied to myapp.ear/myapp-web.war/WEB-INF/lib (one library name per line)
In this case deployed-jars-war.list would look like this:

richfaces-impl.jar
richfaces-ui.jar


The modified ANT script, which copies nothing and instead creates a manifest file with a classpath entry:
<path id="war.classpath.path">
   <fileset dir="${lib.dir}">
       <includesfile name="deployed-jars-war.list"/>
       <exclude name="jboss-seam-gen.jar"/>
       <exclude name="jboss-seam-debug.jar" unless="is.debug"/>
   </fileset>
</path>

<manifestclasspath property="war.classpath" jarfile="${project.name}_war">
   <classpath refid="war.classpath.path" />
</manifestclasspath>

<mkdir dir="${war.dir}/META-INF"/>

<manifest file="${war.dir}/META-INF/MANIFEST.MF">
   <attribute name="Class-Path" value="${war.classpath}"/>
</manifest>
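
For readers who prefer code to build scripts, the value that ends up in war.classpath can be approximated in a few lines of Java. This is only a sketch of the idea: the real manifestclasspath task derives relative paths from the jarfile location, whereas here the lib/ prefix is simply taken from the MANIFEST.MF shown earlier. The class name WarClassPath is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class WarClassPath {

    /** Joins library names into a space-separated Class-Path value. */
    public static String classPath(List<String> jarNames, String prefix) {
        StringBuilder sb = new StringBuilder();
        for (String name : jarNames) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines from the list file
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(prefix).append(trimmed);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two entries of deployed-jars-war.list from this article.
        List<String> jars = Arrays.asList("richfaces-impl.jar", "richfaces-ui.jar");
        System.out.println("Class-Path: " + classPath(jars, "lib/"));
    }
}
```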

Sunday, 2 August 2009

JBoss clustering configuration for a single computer

This tutorial should help with setting up a simple JBoss cluster on a single workstation running Windows Vista. The configuration is useful only for programmers who need to test or develop their program on a cluster but do not have multiple computers available.

Configuring the Microsoft Loopback Adapter with two IP addresses
  1. Control Panel/Add Hardware
  2. Install hardware selected manually from a list
  3. Select Network adapters
  4. Select Network Adapter/Microsoft/Microsoft Loopback Adapter
  5. Rename the connection to "Loopback" for clarity - Control Panel/Network Connections
  6. Loopback network connection/Properties
  7. Internet Protocol Version 4 (TCP/IPv4)
  8. Set the first IP address, 192.168.1.140 (subnet mask 255.255.255.0), on which node1 will listen
  9. Advanced/Add - set the second IP address, 192.168.1.141 (subnet mask 255.255.255.0), on which node2 will listen
  10. Open a Windows console and run ipconfig /all; if the adapter is installed correctly and both IP addresses are configured, the output lists both addresses for the Loopback connection
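
As an alternative to reading the ipconfig output, the same check can be sketched in Java with the standard java.net API: an address counts as configured when some local network interface is bound to it. The class name LocalAddressCheck is hypothetical, and the two addresses in main are the ones chosen in the steps above.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

public class LocalAddressCheck {

    /** Returns true when a local network interface is bound to the given IP address. */
    public static boolean isLocal(String ip) throws UnknownHostException, SocketException {
        InetAddress address = InetAddress.getByName(ip);
        return NetworkInterface.getByInetAddress(address) != null;
    }

    public static void main(String[] args) throws Exception {
        for (String ip : new String[] {"192.168.1.140", "192.168.1.141"}) {
            System.out.println(ip + " bound locally: " + isLocal(ip));
        }
    }
}
```

Both lines should report true before the JBoss nodes are started, since each node binds to one of these addresses.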

Installing the JBoss application server
  1. The application server can be downloaded from: http://www.jboss.org/jbossas/downloads/
  2. Unpack it and create two new profiles, node1 and node2, by copying the all profile into two new directories named node1 and node2 (profiles are located in the %JBOSS_HOME%/server directory), so that node1 and node2 sit next to all
  3. In a console, the command %JBOSS_HOME%/bin/run.bat -c node1 -b 192.168.1.140 -Djboss.messaging.ServerPeerID=1 starts the node1 profile, which will listen on 192.168.1.140; -Djboss.messaging.ServerPeerID=1 sets a unique ID for the messaging service. The console output should report that the cluster has one member
  4. In a second console window, run %JBOSS_HOME%/bin/run.bat -c node2 -b 192.168.1.141 -Djboss.messaging.ServerPeerID=2; the log should report that the cluster now has two members


If everything went well, node1 in the first console should respond with a log similar to this:
18:00:38,092 INFO  [DefaultPartition] New cluster view for partition DefaultPartition (id: 1, delta: 1) : [192.168.1.140:1099, 192.168.1.141:1099]
18:00:38,092 INFO  [RPCManagerImpl] Received new cluster view: [192.168.1.140:49924|1] [192.168.1.140:49924, 192.168.1.141:51781]
18:00:38,108 INFO  [DefaultPartition] I am (192.168.1.140:1099) received membershipChanged event:
18:00:38,108 INFO  [DefaultPartition] Dead members: 0 ([])
18:00:38,108 INFO  [DefaultPartition] New Members : 1 ([192.168.1.141:1099])
18:00:38,108 INFO  [DefaultPartition] All Members : 2 ([192.168.1.140:1099, 192.168.1.141:1099])

Sunday, 26 July 2009

How to configure PostgreSQL datasource in Embedded JBoss

I spent more than an hour fighting this issue. I am writing it down rather than forgetting about it, so that if I have to do this again I will know where to look, and I hope it will be helpful for others as well.

Environment: Seam 2.1.2, PostgreSQL, JBoss embedded

Scenario: A project generated by seam-gen originally used the Hypersonic SQL database (HSQLDB). Later a PostgreSQL database (PGSQL) was installed and properly configured for the dev build profile. However, the tests, which run in embedded JBoss, still use HSQLDB; the goal is to configure the tests to run on PGSQL as well.

Documentation says: By default, a generated project will use the java:/DefaultDS (a built in HSQL datasource in Embedded JBoss) for testing. If you want to use another datasource place the foo-ds.xml into bootstrap/deploy directory.

Sounds easy, considering that the dev profile is already configured with a PGSQL datasource that is working with ordinary JBoss 5.1.0. One would assume that it is sufficient to copy the myproject-dev-ds.xml file to the bootstrap/deploy directory (renaming it to myproject-test-ds.xml for clarity's sake), then change persistence-test.xml to match persistence-dev.xml, and all should work.

But when this is done, running the ant test task produces only this incomprehensible error:
org.jboss.deployers.client.spi.IncompleteDeploymentException: Summary of incomplete deployments (SEE PREVIOUS ERRORS FOR DETAILS):
*** CONTEXTS MISSING DEPENDENCIES: Name -> Dependency{Required State:Actual State}
jboss.jca:name=myProjectDatasource,service=DataSourceBinding
-> jboss:service=invoker,type=jrmp{Start:** NOT FOUND **}
-> jboss:service=invoker,type=jrmp{Create:** NOT FOUND **}
*** CONTEXTS IN ERROR: Name -> Error
jboss:service=invoker,type=jrmp -> ** NOT FOUND **
at org.jboss.deployers.plugins.deployers.DeployersImpl.checkComplete(DeployersImpl.java:576)
at org.jboss.deployers.plugins.main.MainDeployerImpl.checkComplete(MainDeployerImpl.java:559)
at org.jboss.embedded.Bootstrap.bootstrapURL(Bootstrap.java:149)
at org.jboss.embedded.Bootstrap.bootstrap(Bootstrap.java:183)
at org.jboss.embedded.Bootstrap.bootstrap(Bootstrap.java:195)
at org.jboss.seam.mock.EmbeddedBootstrap.startAndDeployResources(EmbeddedBootstrap.java:11)
at org.jboss.seam.mock.AbstractSeamTest.startJbossEmbeddedIfNecessary(AbstractSeamTest.java:1024)
at org.jboss.seam.mock.AbstractSeamTest.startSeam(AbstractSeamTest.java:915)
at org.jboss.seam.mock.SeamTest.startSeam(SeamTest.java:58)
... Removed 15 stack frames
where myproject-test-ds.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<datasources>
<local-tx-datasource>
  <jndi-name>myProjectDatasource</jndi-name>

  <use-java-context>false</use-java-context><!-- note this line -->

  <connection-url>jdbc:postgresql://localhost:5432/test</connection-url>
  <driver-class>org.postgresql.Driver</driver-class>
  <user-name>test</user-name>
  <password>test</password>
</local-tx-datasource>
</datasources>
and persistence-test.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Persistence deployment descriptor for test profile -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
       version="1.0">
<persistence-unit name="myProject">
  <provider>org.hibernate.ejb.HibernatePersistence</provider>

  <jta-data-source>myProjectDatasource</jta-data-source><!-- note this line -->

  <properties>
      <property name="hibernate.dialect" value="org.hibernate.dialect.PostgreSQLDialect"/>
      <property name="hibernate.hbm2ddl.auto" value="create-drop"/>
      <property name="hibernate.show_sql" value="true"/>
      <property name="hibernate.format_sql" value="true"/>
      <property name="jboss.entity.manager.factory.jndi.name" value="java:/myProjectEntityManagerFactory"/>
  </properties>
</persistence-unit>
</persistence>
So both files seem to be configured properly; why is it not working, then? The solution can be found here: https://jira.jboss.org/jira/browse/JBPAPP-2223.

Workaround Description: remove "<use-java-context>false</use-java-context>" from the datasource file and add the "java:/" prefix to the content of the jta-data-source element in persistence.xml

And indeed after changing myproject-test-ds.xml like this:
<?xml version="1.0" encoding="UTF-8"?>
<datasources>
 <local-tx-datasource>
     <jndi-name>myProjectDatasource</jndi-name>
     <connection-url>jdbc:postgresql://localhost:5432/test</connection-url>
     <driver-class>org.postgresql.Driver</driver-class>
     <user-name>test</user-name>
     <password>test</password>
 </local-tx-datasource>
</datasources>
and persistence-test.xml like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Persistence deployment descriptor for test profile -->
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
          version="1.0">
 <persistence-unit name="myProject">
     <provider>org.hibernate.ejb.HibernatePersistence</provider>

     <jta-data-source>java:/myProjectDatasource</jta-data-source><!-- note this line -->

     <properties>
         <property name="hibernate.dialect" value="org.hibernate.dialect.PostgreSQLDialect"/>
         <property name="hibernate.hbm2ddl.auto" value="create-drop"/>
         <property name="hibernate.show_sql" value="true"/>
         <property name="hibernate.format_sql" value="true"/>
         <property name="jboss.entity.manager.factory.jndi.name" value="java:/myProjectEntityManagerFactory"/>
     </properties>
 </persistence-unit>
</persistence>
everything works like a charm.
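
The relationship between the two files can be summed up in a small sketch. It assumes the usual JBoss rule that a datasource is bound under the java:/ prefix unless use-java-context is set to false, and it derives the name that jta-data-source should reference from the text of a -ds.xml file. The class name DatasourceJndiName is hypothetical, and nothing here talks to a running server.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DatasourceJndiName {

    /** Returns the JNDI name that persistence.xml should use for the given -ds.xml content. */
    public static String expectedJtaDataSource(String dsXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(dsXml.getBytes(StandardCharsets.UTF_8)));
        String jndiName = text(doc, "jndi-name");
        if (jndiName == null) {
            throw new IllegalArgumentException("no jndi-name element found");
        }
        String useJavaContext = text(doc, "use-java-context");
        // Assumption: the element defaults to true when it is absent.
        boolean javaContext = useJavaContext == null || Boolean.parseBoolean(useJavaContext);
        return javaContext ? "java:/" + jndiName : jndiName;
    }

    /** Returns the trimmed text of the first element with the given tag, or null. */
    private static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String dsXml = "<datasources><local-tx-datasource>"
                + "<jndi-name>myProjectDatasource</jndi-name>"
                + "</local-tx-datasource></datasources>";
        System.out.println(expectedJtaDataSource(dsXml));
    }
}
```

For the working configuration above this yields java:/myProjectDatasource, which is the value placed in persistence-test.xml. The variant with use-java-context set to false yields the bare name, which is the combination that failed in embedded JBoss.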