Welcome to Xapian
=================

Xapian's build system is built using GNU autoconf, automake, and libtool.
If you've installed other Open Source projects from source, you should
find yourself in familiar territory.  Building and installing involves
the following 3 simple steps:

 1) Run "./configure", possibly with some extra arguments (see below)
 2) Run "make" to build Xapian
 3) Run "make install" to install Xapian

Prerequisites
=============

You'll need to have zlib installed (https://www.zlib.net/) before you can build
Xapian.  The zlib library is very widely used, so you'll probably have it
installed already if you're using Linux, FreeBSD, or similar, but you may need
to install a "zlib development" package to get the zlib library headers.

We recommend using zlib 1.2.x as it apparently fixes a memory leak in
deflateInit2 (which Xapian uses) and decompression is supposed to be about 20%
faster than with 1.1.x, but it's pretty unlikely you'll have an older version
installed these days.

Xapian also requires a way to generate UUIDs.  On FreeBSD, NetBSD, OpenBSD,
AIX and Microsoft Windows, Xapian makes use of built-in UUID APIs.  On Linux
and Android, Xapian 1.4.2 and higher can read UUIDs from a special file under
/proc.  Otherwise you need to install libuuid which you can find in
util-linux (https://github.com/karelzak/util-linux).  On Debian and
Ubuntu, the package to install is uuid-dev, while on Fedora, it is
libuuid-devel (on older Fedora versions you instead need e2fsprogs-devel).

Platforms
=========

Xapian (and particularly xapian-core) aims to be portable to almost all
modern platforms.  In particular, there's no assembler code, we support both
little-endian and big-endian platforms, and both 32-bit and 64-bit
architectures are supported.  There's no fundamental reason why Xapian wouldn't
work on a 16-bit architecture (provided it supports more than 64KB of memory,
e.g. via a segmented memory model) but it's not been tested by anyone as far as
we're aware and some porting work would probably be needed.

There are a small number of assumptions made:

POSIX APIs are available
------------------------

We use POSIX APIs with fall-back code for known cases where support is limited
or missing (Microsoft Windows accounts for the majority of such fall-back
code).  Porting to a new platform which doesn't have good POSIX support may
require adding additional fall-back code.

Character encoding is ASCII-based
---------------------------------

Porting work would be required to run correctly on a platform using a different
character encoding (such as EBCDIC).

Signed integers use two's complement representation
---------------------------------------------------

C++20 has now made this a requirement.  Xapian doesn't yet require C++20, and
older versions of the language standard didn't require this, but apparently all
C++ implementations actually use it anyway.

FLT_RADIX is 2
--------------

It seems all modern platforms use 2 for FLT_RADIX.  This assumption should be
easy to eliminate if you actually have such a platform, but it's hard to
properly test a change to do so without such a platform.

Double precision floating point does not have excess precision
--------------------------------------------------------------

A few older CPU architectures (x86 with 387 FP instructions, and m68k models
68030 and earlier) internally calculate and hold values with excess precision
and reduce precision when storing to memory.

This unfortunately makes some normally easy tasks very hard to perform
consistently, such as ordering results by weight - a key part of what Xapian
does.  Calculating and comparing two weights can give a different order if one
is spilled to memory and reloaded while the other remains in a register (and
when values are spilled depends on the compiler).  This can lead to a > b  and
b > a both evaluating as true, violating the ordering requirements for sort
comparison routines and leading to C++ undefined behaviour.

We have some special-case code which works around this in the place where it
has been reported to cause segfaults, but this doesn't address the general
problem that excess precision can cause calculating the same equations using
the same input values via two different code paths to give different results
and there may be more cases of undefined behaviour due to this which haven't
been uncovered yet.  We don't recommend using such a build, and can't really
support people trying to use it.

For old m68k CPUs we don't have a better solution, but the 68030 was released
in 1987 so can probably be considered obsolete now.

For x86, a simple solution is to use SSE FP instructions instead as these
don't have excess precision.  The configure script in xapian-core will select
SSE2 (which requires a Pentium 4 or later) on x86 if it knows how to for the
compiler in use.

This can be a problem if you're building binary packages for a distribution
which has an x86 baseline which doesn't include SSE2.

If your x86 baseline includes SSE (which requires a Pentium 3 or later) but
not SSE2, then configuring with --enable-sse=sse will select SSE instead.

If your x86 baseline requires a build without SSE, we strongly recommend
providing a build with SSE or SSE2 as well.  On Linux at least, the dynamic
linker will load an alternative version of a shared library from a subdirectory
based on runtime detection of the hardware capabilities of CPU model in use, so
you can satisfy your baseline requires while providing the vast majority of
your users with a better quality build.  See
https://trac.xapian.org/wiki/PackagingXapian for an outline of how the Debian
xapian-core packaging does this.

Compilers
=========

We aim to support compilation with any C++ compiler which supports ISO C++11,
or a reasonable approximation to it.

GCC
---

If you're using GCC, we currently recommend GCC 4.7 or newer (this is the
oldest version we regularly test with).

The current hard minimum requirement is also GCC 4.7 (due to requiring good
support for C++11, for example non-static data member initializers aren't
supported by earlier versions).

If you really still need to use an older version of GCC, Xapian 1.2.x doesn't
require C++11 support and should build with older versions - probably as far
back as GCC 3.1.

Clang
-----

Clang 6.0 is known to work.  We haven't identified the exact minimum
requirement, but it's probably at least Clang 3.2.

MSVC
----

If you're using MS Visual C++, you'll need at least MSVS 2015 for C++11
support.

We support 64-bit compilation with MSVS 2017, 2019 and 2022.  With MSVS 2015 a
64-bit build fails to work - we haven't investigated why.

As of Xapian 1.4.6 building using MSVC is supported by the autotools build
system.  You need to install a set of Unix-like tools first - we recommended
MSYS2: https://www.msys2.org/

You also need to have the MSVC command line tools on your PATH.  MSVC provides
a batch file which sets up PATH and other environment variables.  The exact
details vary by MSVC version, 32- vs 64-bit, and the directory and
drive where MSVC is installed, but for MSVC 2017 it should be something like::

  "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat"

or::

  "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars32.bat"

It doesn't work to run this after starting the MSYS2 shell because the
environment variables set by the batch file don't get propagated back to the
MSYS2 shell.  The simplest approach which works is to run this batch file in
the developer command prompt, then start the MSYS2 shell from there (e.g.
by entering the command `start msys2_shell.cmd -use-full-path`).

And for MSVC 2015 32-bit use::

  "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" x86

MSVC 2015 64-bit isn't currently supported, but would use the above command but
with ``x64`` instead of ``x86``.

You'll need to have the zlib library available.  You can add
``CPPFLAGS=-I/path/to/zlib LDFLAGS=-L/path/to/zlib`` to the configure command
line to tell MSVC where to find the zlib headers and library.

If you build zlib from source, beware that zlib's `win32/Makefile.msc` performs
a debug build by default.  You'll either need to specify `CXXFLAGS=-MD` when
configuring Xapian, or remove `-MD` from `CFLAGS` in zlib's `win32/Makefile.msc`.
If you're building 64-bit you may also want to remove `-base:0x5A4C0000` from
the link command for `$(SHAREDLIB)` as this triggers a linker warning (LNK4281).

To build Xapian, first run configure from the MSYS2 shell like so::

  ./configure CC="cl -nologo" CXX="$PWD/compile cl -nologo" CXXFLAGS=-EHsc AR=lib LD=link NM=dumpbin

If you want a `.pdb` file (which contains debugging information), add `-Zi` to
`CXXFLAGS`, i.e.:

  ./configure CC="cl -nologo" CXX="$PWD/compile cl -nologo" CXXFLAGS="-EHsc -Zi" AR=lib LD=link NM=dumpbin

Then build using GNU make::

  make

HP's aCC
--------

When using HP's aCC, Xapian must be compiled with +std=c++11, which
xapian-core's configure automatically detects and passes.  You don't have to
pass this option when building code which uses Xapian, but you can.

Sun C++ Compiler
----------------

With this compiler, shared library builds fail (tested most recently with
version 12.6).  You can work around this problem by disabling shared libraries
at configure time like so::

  ./configure --disable-shared

Multi-Arch
==========

When using GCC on platforms which support multiple architectures, the simplest
way to select a non-default architecture is to pass a CXX setting to configure
which includes the appropriate -m option - e.g. to build for x86 on x86-64
you would configure with:

./configure CXX='g++ -m32'

Building in a separate directory
================================

If you wish to perform your build in a separate directory from the source,
create and change to the build directory, and run the configure script (in
the source directory) from the build directory, like so:

  mkdir BUILD
  cd BUILD
  ../configure

Options to give to configure
============================

--enable-assertions
	You should use this to build a version of Xapian with many internal
	consistency checks.  This will run more slowly, but is useful if you
	suspect a bug in Xapian.

--enable-backend-chert
--enable-backend-glass
--enable-backend-inmemory
--enable-backend-remote
	These options enable (or disable if --disable-backend-XXX is specified)
	the compiling of each backend (database access methods).  By default,
	all backends for which the appropriate libraries and OS support are
	available will be enabled.  Note: Currently disabling the remote
	backend also disables replication (because the network code is shared).

_FORTIFY_SOURCE
---------------

When compiling with GCC, by default Xapian will be built with _FORTIFY_SOURCE
set to 2 (except on mingw-w64).  This enables some compile time and runtime
checking of values passed to library functions when building with glibc >=
2.34.  If you wish to disable this for any reason, you can just configure like
so:

./configure CPPFLAGS=-D_FORTIFY_SOURCE=0

Or you can set the "fortification level" to 1 or (with new enough glibc and
GCC) 3 instead of 2:

./configure CPPFLAGS=-D_FORTIFY_SOURCE=1

./configure CPPFLAGS=-D_FORTIFY_SOURCE=3

If you're disabling _FORTIFY_SOURCE because it causes problems, please also
report this to us (via the bug tracker or mailing lists).

On mingw-w64 Xapian doesn't automatically enable _FORTIFY_SOURCE as an extra
library is needed.  You can enable it by hand and specify this library like
so:

./configure CPPFLAGS=-D_FORTIFY_SOURCE=2 LIBS=-lssp

-Bsymbolic-functions
--------------------

If -Wl,-Bsymbolic-functions is supported (for example it is by GCC with modern
ld) then it will be automatically used when linking the library.  This causes
all references from inside the library to symbols inside the library to be
resolved when the library is created, rather than when the shared library is
loaded, which decreases the time taken to load the library, reduces its size,
and is also likely to make the code run a little faster.

Should you wish to disable this for some reason, you can configure like so
which disables the probe for -Bsymbolic-functions so it won't ever be used:

./configure xo_cv_symbolic_functions=no

If you're disabling it because it causes problems, please also report this to
us (via the bug tracker or mailing lists).

-fvisibility
------------

When compiling with GCC >= 4.0 for platforms which support symbol visibility,
we automatically pass -fvisibility=hidden to g++ when building the library, and
mark classes, methods, and functions which need exporting with attributes to
make them visible.

Should you wish to disable this for some reason, you can configure like so:

./configure --disable-visibility

If you're disabling it because it causes problems, please also report this to
us (via the bug tracker or mailing lists).

Developers
==========

There are additional scripts and configure options to help people doing
development work on Xapian itself, and people who are building from git.
Read HACKING to find out about them.
