Copyright © 2009 CNRS
Copyright © 2009-2024 Inria. All rights reserved.
Copyright © 2009-2013 Université Bordeaux
Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
Copyright © 2020 Hewlett Packard Enterprise. All rights reserved.

$COPYRIGHT$

Additional copyrights may follow

$HEADER$

===========================================================================

This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of hwloc since version
0.9.


Version 2.11.2
--------------
* Add missing CPU info attrs on aarch64 on Linux.
* Use ACPI CPPC on Linux to get better information about cpukinds,
  at least on AMD CPUs.
* Fix crash when manipulating cpukinds after topology
  duplication, thanks to Hadrien Grasland for the report.
* Fix missing input target checks in memattr functions,
  thanks to Hadrien Grasland for the report.
* Fix a memory leak when ignoring NUMA distances on FreeBSD.
* Fix build failure on old Linux distributions without accessat().
* Fix non-Windows importing of XML topologies and CPUID dumps exported
  on Windows.
* hwloc-calc --cpuset-output-format systemd-dbus-api can now
  generate AllowedCPUs information for systemd slices.
  See the hwloc-calc manpage for examples. Thanks to Pierre Neyron.
* Fix some manpage EXAMPLES and split them into subsections.


Version 2.11.1
--------------
* Fix bash completions, thanks to Tavis Rudd.


Version 2.11.0
--------------
* API
  + Add HWLOC_MEMBIND_WEIGHTED_INTERLEAVE memory binding policy on
    Linux 6.9+. Thanks to Honggyu Kim for the patch.
    - weighted_interleave_membind is added to membind support bits.
    - The "weighted" policy is added to the hwloc-bind tool.
  + Add hwloc_obj_set_subtype(). Thanks to Hadrien Grasland for the report.
* GPU support
  + Don't hide the GPU NUMA node on NVIDIA Grace Hopper.
  + Get Intel GPU OpenCL device locality.
  + Add bandwidths between subdevices in the LevelZero XeLinkBandwidth
    matrix.
  + Fix PCI Gen4+ link speed of NVIDIA GPU obtained from NVML,
    thanks to Akram Sbaih for the report.
* Windows support
  + Fix Windows support when UNICODE is enabled (several hwloc features
    were missing), thanks to Martin for the report.
  + Fix the enabling of CUDA in Windows CMake build,
    thanks to Moritz Kreutzer for the patch.
  + Fix CUDA/OpenCL test source path in Windows CMake.
* Tools
  + Option --best-memattr may now return multiple nodes. Additional
    configuration flags may be given to tweak its behavior.
  + hwloc-info has a new --get-attr option to get a single attribute.
  + hwloc-info now supports "levels", "support" and "topology"
    special keywords for backward compatibility for hwloc 3.0.
  + The --taskset command-line option is superseded by the new
    --cpuset-output-format which can also export as a list.
  + hwloc-calc may now import bitmasks described as a list of bits
    with the new "--cpuset-input-format list".
* Misc
  + The MemoryTiersNr info attribute in the root object now says how many
    memory tiers were built. Thanks to Antoine Morvan for the report.
  + Fix the management of infinite cpusets in the bitmap printf/sscanf
    API as well as in command-line tools.
  + Add section "Compiling software on top of hwloc's C API" in the
    documentation with examples for GNU Make and CMake,
    thanks to Florent Pruvost for the help.
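
To illustrate what a cpuset "described as a list of bits" looks like
(the "--cpuset-input-format list" entry above), here is a minimal sketch
that turns a string such as "0-3,8" into a 64-bit mask. The helper
parse_cpu_list() is hypothetical and is not hwloc code. The exact syntax
accepted by hwloc-calc is assumed here to be comma-separated indexes and
ranges, and real hwloc bitmaps are not limited to 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of hwloc): parse a list such as "0-3,8"
 * into a 64-bit mask. Returns 0 on success, -1 on malformed input or on
 * an index above 63. */
int parse_cpu_list(const char *s, uint64_t *mask)
{
    assert(s != NULL && mask != NULL);
    *mask = 0;
    while (*s) {
        char *end;
        unsigned long first = strtoul(s, &end, 10);
        unsigned long last = first;
        if (end == s)
            return -1;              /* no digits where an index was expected */
        if (*end == '-') {          /* a range "first-last" */
            s = end + 1;
            last = strtoul(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (first > last || last > 63)
            return -1;
        for (unsigned long i = first; i <= last; i++)
            *mask |= (uint64_t)1 << i;
        if (*end == ',')
            end++;                  /* move on to the next list element */
        else if (*end != '\0')
            return -1;
        s = end;
    }
    return 0;
}
```

Calling parse_cpu_list("0-3,8", &mask) sets bits 0, 1, 2, 3 and 8.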


Version 2.10.0
--------------
* Heterogeneous Memory core improvements
  + Better heuristics to identify the subtype of memory such as HBM,
    DRAM, NVM, CXL-DRAM, etc.
  + Build memory tiers, i.e. sets of NUMA nodes with the same subtype
    and similar performance.
    - NUMA node tier ranks are exposed in the new MemoryTier info
      attribute (starts from 0 for highest bandwidth tier).
  + See the new Heterogeneous Memory section in the documentation.
* API
  + Add hwloc_topology_free_group_object() to discard a Group created
    by hwloc_topology_alloc_group_object().
* Linux backend
  + Fix cpukinds on NVIDIA Grace to report identical cores even if they
    actually have very small frequency differences.
    Thanks to John C. Linford for the report.
  + Add CXLDevice attributes to CXL DAX objects and NUMA nodes to show
    which PCI device implements which window.
  + Ignore buggy memory-side caches and memory attributes when fake NUMA
    emulation is enabled on the Linux kernel command-line.
  + Add more info attributes in MemoryModule Misc objects,
    thanks to Zubiao Xiong for the patch.
  + Get CPUModel and CPUFamily info attributes on LoongArch platforms.
* x86 backend
  + Add support for new AMD CPUID leaf 0x80000026 for better detection
    of Core Complex and Die on Zen4 processors.
  + Improve Zhaoxin CPU topology detection.
* Tools
  + Input locations and many command-line options (e.g. hwloc-calc -I -N -H,
    lstopo --only) now accept filters such as "NUMA[HBM]" so that only
    objects of that type and subtype are considered.
    - NUMA[tier=1] is also accepted for selecting NUMA nodes depending
      on their MemoryTier info attribute.
  + Add --object-output to hwloc-calc to report the type as a prefix to
    object indexes, e.g. Core:2 instead of 2 in the output of -I.
  + hwloc-info --ancestor and --descendants now accept kinds of objects
    instead of single types.
    - The new --first option only shows the first matching object.
  + Add --children-of-pid to hwloc-ps to show a hierarchy of processes.
    Thanks to Antoine Morvan for the suggestion.
  + Add --misc-from to lstopo to add Misc objects described in a file.
    - To be combined with the new hwloc-ps --lstopo-misc for a customizable
      lstopo --top replacement.
* Misc
  + lstopo may now configure the layout of memory objects placed above,
    for instance with --children-order memory:above:vert.
  + Fix XML import from memory or stdin when using libxml2 2.12.
  + Fix installation failures when configuring with --target,
    thanks to Clement Foyer for the patch.
  + Fix support for 128bit pointer architectures.
  + Remove Netloc.
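
The filter syntax introduced under Tools above ("NUMA[HBM]",
"NUMA[tier=1]") pairs an object type with a bracketed qualifier. The
sketch below shows one way to split such a string. The function
split_location_filter() is hypothetical, is not how the hwloc tools are
implemented, and does not interpret the filter itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parser (not hwloc's own code): split "TYPE[FILTER]" into its
 * two parts. Returns 2 when a bracketed filter is present, 1 when only a
 * type is given, and 0 when the input has no type at all. */
int split_location_filter(const char *s, char type[16], char filter[32])
{
    int n;
    type[0] = filter[0] = '\0';
    /* "%15[^[]" reads the type up to '[', "%31[^]]" reads up to ']'. */
    n = sscanf(s, "%15[^[][%31[^]]]", type, filter);
    return n < 0 ? 0 : n;
}
```

For "NUMA[tier=1]" this yields the type "NUMA" and the filter "tier=1".
A plain "Core" yields only the type.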


Version 2.9.3
-------------
* Handle Linux glibc allocation errors in binding routines (CVE-2022-47022).
* Fix hwloc-calc when searching objects on heterogeneous memory platforms,
  thanks to Antoine Morvan for the report.
* Fix hwloc_get_next_child() when there are some memory-side caches.
* Don't crash if the topology is empty because Linux cgroups are wrong.
* Improve some hwloc-bind warnings in case of command-line parsing errors.
* Many documentation improvements all over the place, including:
  + hwloc_topology_restrict() and hwloc_topology_insert_group() may reorder
    children, causing the logical indexes of objects to change.


Version 2.9.2
-------------
* Don't forget L3i when defining filters for multiple levels of caches
  with hwloc_topology_set_cache/icache_types_filter().
* Fix object total_memory after hwloc_topology_insert_group_object().
* Fix the (not yet supported) exporting in synthetic description for complex
  memory hierarchies with memory-side caches, etc.
* Fix some default size attributes when building synthetic topologies.
* Fix size units in hwloc-annotate.
* Improve bitmap reallocation error management in many functions.
* Documentation improvements:
  + Better document return values of functions.
  + Add "Error reporting" section (in hwloc.h and in the doxygen doc).
  + Add FAQ entry "What may I disable to make hwloc faster?"
  + Improve FAQ entries "Why is lstopo slow?" and
    "I only need ..., why should I use hwloc?"
  + Clarify how to deal with cpukinds in hwloc-calc and hwloc-bind
    manpages.


Version 2.9.1
-------------
* Don't forget to apply object type filters to "perflevel" caches detected
  on recent Mac OS X releases, thanks to Michel Lesoinne for the report.
* Fix a failed assertion in hwloc_topology_restrict() when some NUMA nodes
  are removed because of HWLOC_RESTRICT_FLAG_REMOVE_CPULESS but no PUs are.
  Thanks to Mark Grondona for reporting the issue.
* Mark HPE Cray Slingshot NICs with subtype "Slingshot".


Version 2.9.0
-------------
* Backends
  + Expose the memory size of CXL memory devices (Type 3) on Linux.
  + The LevelZero backend now reports the "XeLinkBandwidth" distance
    matrix between L0 devices (and subdevices) when available.
  + Add support for CUDA compute capability up to 9.0.
* Tools
  + lstopo now switches to console mode when its output is redirected.
    Graphical window mode may be forced back with --of window.
  + hwloc-calc now accepts "numa" in -H, and I/O subtypes such as "gpu"
    in -I and -N.


Version 2.8.0
-------------
* API
  + Add HWLOC_TOPOLOGY_FLAG_NO_DISTANCES, _NO_MEMATTRS and _NO_CPUKINDS
    to reduce the overhead when unneeded.
  + Add separate Read/Write Bandwidth/Latency memory attributes and
    implement them on Linux.
* Backends
  + NUMA nodes may now have a subtype such as DRAM, HBM, SPM, or NVM
    on heterogeneous memory platforms on Linux.
    - Add DAXType and DAXParent attributes on Linux to tell where a
      DAX device or its corresponding NUMA node comes from (SPM for
      Specific-Purpose or NVM for Non-Volatile Memory).
  + Detect heterogeneous caches in hybrid CPUs on Mac OS X,
    thanks to Paul Bone for the help.
  + Max frequencies are not ignored in Linux cpukinds anymore (they were
    ignored in hwloc 2.7.0), but they may be slightly adjusted to avoid
    reporting hybrid CPUs because of Intel Turbo Boost Max 3.0.
    - See the documentation of environment variable HWLOC_CPUKINDS_MAXFREQ.
  + Hardwire the PCI locality of HPE Cray EX235a nodes.
* Tools
  + lstopo and other tools may now load Linux and x86 cpuid topology files
    from a tarball.
  + lstopo may now replace the P# and L# index prefixes with custom strings
    thanks to --os-index-prefix and --logical-index-prefix options.
* Misc
  + Add --disable-readme to avoid regenerating the top-level hwloc README
    file from the documentation.
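
The cpukinds entry above says that max frequencies "may be slightly
adjusted" so that small Turbo Boost Max 3.0 differences do not make a
CPU look hybrid. This file does not describe the actual adjustment. The
sketch below only illustrates the general idea of treating nearly equal
frequencies as one kind. The function name and the tolerance parameter
are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only: count distinct "kinds" among per-core max frequencies
 * (in MHz, sorted in ascending order). A frequency within tolerance_mhz of
 * the first frequency of the current group is treated as identical to it.
 * The tolerance is a made-up parameter, not hwloc's actual threshold. */
size_t count_freq_kinds(const unsigned *mhz, size_t n, unsigned tolerance_mhz)
{
    size_t kinds = 0;
    unsigned group_base = 0;
    for (size_t i = 0; i < n; i++) {
        if (kinds == 0 || mhz[i] - group_base > tolerance_mhz) {
            kinds++;                /* start a new group at this frequency */
            group_base = mhz[i];
        }
    }
    return kinds;
}
```

With a tolerance of 0, cores at 4900 and 5000 MHz form two kinds. With a
tolerance of 100 MHz they collapse into one.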


Version 2.7.2
-------------
* Fix a crash when LevelZero devices have multiple subdevices,
  e.g. on PonteVecchio GPUs, thanks to Jonathan Peyton.
* Fix a leak when importing cpukinds from XML,
  thanks to Hui Zhou.


Version 2.7.1
-------------
* Workaround crashes when virtual machines report incoherent x86 CPUID
  information about numbers of cores and threads.
  Thanks to Peter Bense for the report.
* Use setenv() instead of putenv() when trying to force enable oneAPI L0
  support, to avoid issues with applications that touch the environment,
  thanks to Josh Hursey for the patch.
* Add some warnings at the end of configure when GPU libraries are
  missing on the system or their path is missing in the environment.


Version 2.7.0
-------------
* Backends
  + Add support for NUMA nodes and caches with more than 64 PUs across
    multiple processor groups on Windows 11 and Windows Server 2022.
  + Group objects are not created for Windows processor groups anymore,
    except if HWLOC_WINDOWS_PROCESSOR_GROUP_OBJS=1 in the environment.
  + Expose "Cluster" group objects on Linux kernel 5.16+ for CPUs
    that share some internal cache or bus. This can be equivalent
    to the L2 Cache level on some platforms (e.g. x86) or a specific
    level between L2 and L3 on others (e.g. ARM Kunpeng 920).
    Thanks to Jonathan Cameron for the help.
    - HWLOC_DONT_MERGE_CLUSTER_GROUPS=1 may be set in the environment
      to prevent these groups from being merged with identical caches, etc.
  + Improve the oneAPI LevelZero backend:
    - Expose subdevices such as "ze0.1" inside root OS devices ("ze0")
      when the hardware contains multiple subdevices.
    - Add many new attributes to describe device type, and the
      numbers of slices, subslices, execution units and threads.
    - Expose the memory information as LevelZeroHBM/DDR/MemorySize infos.
  + Ignore the max frequencies of cores in Linux cpukinds when the
    base frequencies are available (to avoid exposing hybrid CPUs
    when Intel Turbo Boost Max 3.0 gives slightly different max
    frequencies to CPU cores).
    - May be reverted by setting HWLOC_CPUKINDS_MAXFREQ=1 in the environment.
* Tools
  + Add --grey and --palette options to switch lstopo to greyscale or
    white-background-only graphics, or to tune individual colors.
* Build
  + Windows CMake builds now support non-MSVC compilers, detect several
    features at build time, can build/run tests, etc.
    Thanks to Michael Hirsch and Alexander Neumann.


Version 2.6.0
-------------
* Backends
  + Expose two cpukinds for energy-efficient cores (icestorm) and
    high-performance cores (firestorm) on Apple M1 on Mac OS X.
  + Use sysfs CPU "capacity" to rank hybrid cores by efficiency
    on Linux when available (mostly on recent ARM platforms for now).
  + Improve HWLOC_MEMBIND_BIND (without the STRICT flag) on Linux kernel
    >= 5.15: If more than one node is given, the kernel may now use all
    of them instead of only the first one before falling back to others.
  + Expose cache os_index when available on Linux; it may be needed
    when using resctrl to configure cache partitioning, memory bandwidth
    monitoring, etc.
  + Add a "XGMIHops" distances matrix in the RSMI backend for AMD GPUs
    interconnected through XGMI links.
  + Expose AMD GPU memory information (VRAM and GTT) in the RSMI backend.
  + Add OS devices such as "bxi0" for Atos/Bull BXI HCAs on Linux.
* Tools
  + lstopo has a better placement algorithm with respect to I/O
    objects, see --children-order in the manpage for details.
  + hwloc-annotate may now change object subtypes and cache or memory
    sizes.
* Build
  + Allow specifying the ROCm installation for building the RSMI backend:
    - Use a custom installation path if specified with --with-rocm=<dir>.
    - Use /opt/rocm-<version> if specified with --with-rocm-version=<version>
      or the ROCM_VERSION environment variable.
    - Try /opt/rocm if it exists.
    - See "How do I enable ROCm SMI and select which version to use?"
      in the FAQ for details.
  + Add a CMakeLists for Windows under contrib/windows-cmake/.
* Documentation
  + Add FAQ entry "How do I create a custom heterogeneous and
    asymmetric topology?"
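
The ROCm lookup described under Build above can be read as a small
decision chain. The sketch below restates it in C under the assumption
that the three rules are tried in the order listed. The function
pick_rocm_dir() and its parameters are hypothetical. The real logic is
part of hwloc's configure script and is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical restatement of the lookup order (not hwloc code).
 * with_rocm:      value given to --with-rocm=<dir>, or NULL
 * rocm_version:   value of --with-rocm-version or ROCM_VERSION, or NULL
 * default_exists: non-zero if /opt/rocm exists on the build machine
 * Returns 0 and fills out on success, -1 if no candidate applies or if
 * the buffer is too small. */
int pick_rocm_dir(const char *with_rocm, const char *rocm_version,
                  int default_exists, char *out, size_t outlen)
{
    int n;
    if (with_rocm)
        n = snprintf(out, outlen, "%s", with_rocm);
    else if (rocm_version)
        n = snprintf(out, outlen, "/opt/rocm-%s", rocm_version);
    else if (default_exists)
        n = snprintf(out, outlen, "/opt/rocm");
    else
        return -1;
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

An explicit directory wins over a version, and a version wins over the
default /opt/rocm path.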


Version 2.5.0
-------------
* API
  + Add hwloc/windows.h to query Windows processor groups.
  + Add hwloc_get_obj_with_same_locality() to convert between objects
    with same locality, for instance NUMA nodes and Packages,
    or OS devices within a PCI device.
  + Add hwloc_distances_transform() to modify distances structures.
    - hwloc-annotate and lstopo have new distances-transform options.
  + hwloc_distances_add() is replaced with _add_create() followed by
    _add_values() and _add_commit(). See hwloc/distances.h for details.
  + Add topology flags to mitigate binding modifications during
    hwloc discovery, especially on Windows:
    - HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING
      restrict discovery to PUs and NUMA nodes inside the binding.
    - HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents the binding
      from ever being changed during discovery.
* Backends
  + Add a levelzero backend for oneAPI L0 devices, exposed as OS devices
    of subtype "LevelZero" and name such as "ze0".
    - Add hwloc/levelzero.h for converting between L0 API devices
      and hwloc cpusets or OS devices.
  + Expose NEC Vector Engine cards on Linux as OS devices of subtype
    "VectorEngine" and name "ve0", etc.
    Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
  + Add a NVLinkBandwidth distances structure between NVIDIA GPUs
    (and POWER processor or NVSwitches) in the NVML backend,
    and a XGMIBandwidth distances structure between AMD GPUs
    in the RSMI backend.
    - See "Topology Attributes: Distances, Memory Attributes and CPU Kinds"
      in the documentation for details about these new distances.
  + Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
* Build
  + Add --with-cuda-version=<version> or look at the CUDA_VERSION
    environment variable to find the appropriate CUDA pkg-config files.
    Thanks to Stephen Herbein for the suggestion.
    - Also add --with-cuda=<dir> to specify the CUDA installation path
      manually (and its NVML and OpenCL components).
      Thanks to Andrea Bocci for the suggestion.
    - See "How do I enable CUDA and select which CUDA version to use?"
      in the FAQ for details.
* Tools
  + lstopo now has a --windows-processor-groups option on Windows.
  + hwloc-ps now has a --short-name option to avoid long/truncated
    command path.
  + hwloc-ps now has a --single-ancestor option to return a single
    (possibly too large) object where a process is bound.
  + hwloc-ps --pid-cmd may now query environment variables,
    including MPI-specific variables to find out process ranks.


Version 2.4.1
-------------
* Fix AMD OpenCL device locality when PCI bus or device number >= 128.
  Thanks to Edgar Leon for reporting the issue.
  + Applications using any of the following inline functions must
    be recompiled to get the fix: hwloc_opencl_get_device_pci_busid(),
    hwloc_opencl_get_device_cpuset(), hwloc_opencl_get_device_osdev().
* Fix the ranking of cpukinds on non-Windows systems,
  thanks to Ivan Kochin for the report.
* Fix the insertion of custom Groups after loading the topology,
  thanks to Scott Hicks.
* Add support for CPU0 being offline in Linux, thanks to Garrett Clay.
* Fix missing x86 Package and Core objects on FreeBSD/NetBSD.
  Thanks to Thibault Payet and Yuri Victorovich for the report.
* Fix the import of very large distances with heterogeneous object types.
* Fix a memory leak in the Linux backend,
  thanks to Perceval Anichini.
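
The first entry above concerns PCI bus or device numbers >= 128. This
file does not say what caused the problem. A threshold of 128 is typical
of an 8-bit value being read through a signed type, so the sketch below
shows that general pitfall and its usual remedy. It is an assumption
made for illustration, not a description of the actual hwloc fix, and
both function names are hypothetical.

```c
#include <assert.h>

/* Illustration only. If a PCI bus or device number arrives in a signed
 * 8-bit field, values >= 128 read back as negative numbers. */
unsigned pci_field_naive(signed char raw)
{
    return (unsigned)raw;                 /* sign-extends negative values */
}

/* Going through unsigned char first restores the 0..255 range. */
unsigned pci_field_fixed(signed char raw)
{
    return (unsigned)(unsigned char)raw;
}
```

Both functions agree for values below 128. Only the second one returns
128 when given the bit pattern that a signed char stores as -128.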


Version 2.4.0
-------------
* API
  + Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
    - Use Linux cpufreq frequencies to rank cores by efficiency.
    - Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type
      files to identify Intel Atom and Core cores.
    - Use the Windows native EfficiencyClass to separate kinds.
* Backends
  + Properly handle Linux kernel 5.10+ exposing ACPI HMAT information
    with knowledge of Generic Initiators.
* Tools
  + lstopo has new --cpukinds and --no-cpukinds options for showing
    CPU kinds or not in textual and graphical modes respectively.
  + hwloc-calc has a new --cpukind option for filtering PUs by kind.
  + hwloc-annotate has a new cpukind command for modifying CPU kinds.
* Misc
  + Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
  + Add a documentation section about
    "Topology Attributes: Distances, Memory Attributes and CPU Kinds".
  + Silence some spurious warnings in the OpenCL backend and when showing
    process binding with lstopo --ps.
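
For readers unfamiliar with hwloc_bitmap_nr_ulongs(), fixed under Misc
above, the sketch below shows the kind of arithmetic involved. It
assumes the function counts the unsigned longs needed to hold every bit
from bit 0 up to the last set bit. The helper ulongs_needed() is a
hypothetical stand-in and says nothing about what the fixed bug was.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in, not hwloc's implementation: number of unsigned
 * longs needed to hold every bit from bit 0 up to last_set_bit.
 * A negative last_set_bit means that the bitmap is empty. */
int ulongs_needed(int last_set_bit)
{
    const int bits_per_long = (int)(sizeof(unsigned long) * CHAR_BIT);
    if (last_set_bit < 0)
        return 0;
    return last_set_bit / bits_per_long + 1;
}
```

On a platform with 64-bit unsigned longs, bit 63 still fits in one word
while bit 64 requires two.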


Version 2.3.0
-------------
* API
  + Add hwloc/memattrs.h for exposing latency/bandwidth information
    between initiators (CPU sets for now) and target NUMA nodes,
    typically on heterogeneous platforms.
    - When available, bandwidths and latencies are read from the ACPI HMAT
      table exposed by Linux kernel 5.2+.
    - Attributes may also be customized to expose user-defined performance
      information.
  + Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are
    local to some locality.
  + The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes
    support arrays to be loaded from XML exported with hwloc 2.3+.
    - hwloc_topology_get_support() now returns an additional "misc"
      array with feature "imported_support" set when support was imported.
  + Add hwloc_topology_refresh() to refresh internal caches after modifying
    the topology and before consulting the topology in a multithreaded context.
* Backends
  + Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting
    the locality of AMD GPUs, now exposed as "rsmi" OS devices.
    Thanks to Mike Li.
  + Remove POWER device-tree-based topology on Linux
    (it was disabled by default since 2.1).
* Tools
  + Command-line options for specifying flags now understand comma-separated
    lists of flag names (substrings).
  + hwloc-info and hwloc-calc have new --local-memory, --local-memory-flags
    and --best-memattr options for reporting local memory nodes and filtering
    by memory attributes.
  + hwloc-bind has a new --best-memattr option for filtering by memory attributes
    among the memory binding set.
  + Tools that have a --restrict option may now receive a nodeset or
    some custom flags for restricting the topology.
  + lstopo now has a --thickness option for changing line thickness in the
    graphical output.
  + Fix lstopo drawing when autoresizing on Windows 10.
  + Pressing the F5 key in lstopo X11 and Windows graphical/interactive outputs
    now refreshes the display according to the current topology and binding.
  + Add a tikz lstopo graphical backend to generate pictures that are easily
    included in LaTeX documents. Thanks to Clement Foyer.
* Misc
  + The default installation path of the Bash completion file has changed to
    ${datadir}/bash-completion/completions/hwloc. Thanks to Tomasz Kłoczko.


Version 2.2.0
-------------
* API
  + Add hwloc_bitmap_singlify_by_core() to remove SMT from a given cpuset,
    thanks to Florian Reynier for the suggestion.
  + Add --enable-32bits-pci-domain to stop ignoring PCI devices with domain
    >16bits (e.g. 10000:02:03.4). Enabling this option breaks the library ABI.
    Thanks to Dylan Simon for the help.
* Backends
  + Add support for Linux cgroups v2.
  + Add NUMA support for FreeBSD.
  + Add get_last_cpu_location support for FreeBSD.
  + Remove support for Intel Xeon Phi (MIC, Knights Corner) co-processors.
* Tools
  + Add --uid to filter the hwloc-ps output by uid on Linux.
  + Add a GRAPHICAL OUTPUT section in the manpage of lstopo.
* Misc
  + Use the native dlopen instead of libltdl,
    unless --disable-plugin-dlopen is passed at configure time.
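
The --enable-32bits-pci-domain entry above gives 10000:02:03.4 as an
example of a PCI address whose domain does not fit in 16 bits. The
sketch below parses such a "domain:bus:device.function" string and
checks the domain width. The structure and both functions are
hypothetical helpers written for this example and are not hwloc API.

```c
#include <assert.h>
#include <stdio.h>

struct pci_busid {
    unsigned domain, bus, dev, func;
};

/* Hypothetical helper: parse "domain:bus:dev.func" with hexadecimal
 * fields. Returns 0 on success, -1 if the string does not match. */
int parse_pci_busid(const char *s, struct pci_busid *id)
{
    if (sscanf(s, "%x:%x:%x.%x",
               &id->domain, &id->bus, &id->dev, &id->func) != 4)
        return -1;
    return 0;
}

/* Non-zero when the domain fits in 16 bits. */
int pci_domain_fits_16bits(const struct pci_busid *id)
{
    return id->domain <= 0xffffu;
}
```

The domain 0x10000 needs 17 bits, so the second function returns 0 for
the address quoted in the entry.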


Version 2.1.0
-------------
* API
  + Add a new "Die" object (HWLOC_OBJ_DIE) for upcoming x86 processors
    with multiple dies per package, in the x86 and Linux backends.
  + Add the new HWLOC_OBJ_MEMCACHE object type for memory-side caches.
    - They are filtered-out by default, except in command-line tools.
    - They are only available on very recent platforms running Linux 5.2+
      and up-to-date ACPI tables.
    - The KNL MCDRAM in cache mode is still exposed as a L3 unless
      HWLOC_KNL_MSCACHE_L3=0 in the environment.
  + Add HWLOC_RESTRICT_FLAG_BYNODESET and _REMOVE_MEMLESS for restricting
    topologies based on some memory nodes.
  + Add hwloc_topology_set_components() for blacklisting some components
    from being enabled in a topology.
  + Add hwloc_bitmap_nr_ulongs() and hwloc_bitmap_from/to_ulongs(),
    thanks to Junchao Zhang for the suggestion.
  + Improve the API for dealing with disallowed resources:
    - HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM is replaced with FLAG_INCLUDE_DISALLOWED
      and --whole-system command-line options with --disallowed.
      . Former names are still accepted for backward compatibility.
    - Add hwloc_topology_allow() for changing allowed sets after load().
    - Add the HWLOC_ALLOW=all environment variable to totally ignore
      administrative restrictions such as Linux Cgroups.
    - Add disallowed_pu and disallowed_numa bits to the discovery support
      structure.
  + Group objects have a new "dont_merge" attribute to prevent them from
    being automatically merged with identical parent or children.
  + Add more distances-related features:
    - Add hwloc_distances_get_name() to retrieve a string describing
      what a distances structure contains.
    - Add hwloc_distances_get_by_name() to retrieve distances structures
      based on their name.
    - Add hwloc_distances_release_remove().
    - Distances may now cover objects of different types with new kind
      HWLOC_DISTANCES_KIND_HETEROGENEOUS_TYPES.
* Backends
  + Add support for Linux 5.3 new sysfs cpu topology files with Die information.
  + Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
  + Improve memory locality on Linux by using HMAT initiators (exposed
    since Linux 5.2), and NUMA distances for CPU-less NUMA nodes.
  + The x86 backend now properly handles offline CPUs.
  + Detect the locality of NVIDIA GPU OpenCL devices.
  + Ignore NUMA nodes that correspond to NVIDIA GPUs by default.
    - They may be unignored if HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=1 in the environment.
    - Fix their CPU locality and add info attributes to identify them.
      Thanks to Max Katz and Edgar Leon for the help.
  + Add support for IBM S/390 drawers.
  + Rework the heuristics for discovering KNL Cluster and Memory modes
    to stop assuming all CPUs are online (required for mOS support).
    Thanks to Sharath K Bhat for testing patches.
  + Ignore NUMA node information from AMD topoext in the x86 backend,
    unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
  + Expose Linux DAX devices as hwloc Block OS devices.
  + Remove support for /proc/cpuinfo-only topology discovery in Linux
    kernels prior to 2.6.16.
  + Disable POWER device-tree-based topology on Linux by default.
    - It may be reenabled by setting HWLOC_USE_DT=1 in the environment.
  + Discovery components are now divided in phases that may be individually
    blacklisted.
    - The linuxio component has been merged back into the linux component.
* Tools
  + lstopo
    - lstopo factorizes objects by default in the graphical output when
      there are more than 4 identical children.
      . New options --no-factorize and --factorize may be used to configure this.
      . Hit the 'f' key to disable factorizing in interactive outputs.
    - Both logical and OS/physical indexes are now displayed by default
      for PU and NUMA nodes.
    - The X11 and Windows interactive outputs support many keyboard
      shortcuts to dynamically customize the attributes, legend, etc.
    - Add --linespacing and change default margins and linespacing.
    - Add --allow for changing allowed sets.
    - Add a native SVG backend. Its graphical output may be slightly less
      pretty than Cairo (still used by default if available) but the SVG
      code provides attributes to manipulate objects from HTML/JS.
      See dynamic_SVG_example.html for an example.
  + Add --nodeset options to hwloc-calc for converting between cpusets and
    nodesets.
  + Add --no-smt to lstopo, hwloc-bind and hwloc-calc to ignore multiple
    PUs in SMT cores.
  + hwloc-annotate may annotate multiple locations at once.
  + Add an HTML/JS version of hwloc-ps. See contrib/hwloc-ps.www/README.
  + Add bash completions.
* Misc
  + Add several FAQ entries in "Compatibility between hwloc versions"
    about API version, ABI, XML, Synthetic strings, and shmem topologies.


Version 2.0.4 (also included in 1.11.13 when appropriate)
-------------
* Add support for Linux 5.3 new sysfs cpu topology files with Die information.
* Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
* Tiles, Modules and Dies are exposed as Groups for now.
  + HWLOC_DONT_MERGE_DIE_GROUPS=1 may be set in the environment to prevent
    Die groups from being automatically merged with identical parent or children.
* Ignore NUMA node information from AMD topoext in the x86 backend,
  unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
* Group objects have a new "dont_merge" attribute to prevent them from
  being automatically merged with identical parent or children.


Version 2.0.3 (also included in 1.11.12 when appropriate)
-------------
* Fix build on Cygwin, thanks to Marco Atzeri for the patches.
* Fix a corner case of hwloc_topology_restrict() where children would
  become out-of-order.
* Fix the return length of export_xmlbuffer() functions to always
  include the ending \0.
* Fix lstopo --children-order argument parsing.


Version 2.0.2 (also included in 1.11.11 when appropriate)
-------------
* Add support for Hygon Dhyana processors in the x86 backend,
  thanks to Pu Wen for the patch.
* Fix symbol renaming to also rename internal components,
  thanks to Evan Ramos for the patch.
* Fix build on HP-UX, thanks to Richard Lloyd for reporting the issues.
* Detect PCI link speed without being root on Linux >= 4.13.
* Add HWLOC_VERSION* macros to the public headers,
  thanks to Gilles Gouaillardet for the suggestion.


Version 2.0.1 (also included in 1.11.10 when relevant)
-------------
* Bump the library soname to 15:0:0 to avoid conflicts with hwloc 1.11.x
  releases. The hwloc 2.0.0 soname was buggy (12:0:0); applications will
  have to be recompiled.
* Serialize pciaccess discovery to fix concurrent topology loads in
  multiple threads.
* Fix hwloc-dump-hwdata to only process SMBIOS information that corresponds
  to the KNL and KNM configuration.
* Add a heuristic for guessing KNL/KNM memory and cluster modes when
  hwloc-dump-hwdata could not run as root earlier.
* Add --no-text lstopo option to remove text from some boxes in the
  graphical output. Mostly useful for removing Group labels.
* Some minor fixes to memory binding.
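
The soname values quoted in the first entry above (15:0:0 and 12:0:0)
have the shape of a libtool current:revision:age triplet. Assuming
libtool's usual Linux convention, where the number in the soname is
current minus age, the sketch below derives the resulting file name. The
function linux_soname() is hypothetical and the library base name is
passed in by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch assuming libtool's usual Linux convention: a version triplet
 * current:revision:age yields a soname major number of (current - age).
 * Returns 0 on success, -1 on invalid input or a too-small buffer. */
int linux_soname(const char *lib, unsigned current, unsigned age,
                 char *out, size_t outlen)
{
    int n;
    if (age > current)
        return -1;
    n = snprintf(out, outlen, "%s.so.%u", lib, current - age);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under that assumption, 15:0:0 maps to libhwloc.so.15 and the earlier
12:0:0 maps to libhwloc.so.12.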


Version 2.0.0
-------------
*** The ABI of the library has changed. ***
    For instance some hwloc_obj fields were reordered, added or removed, see below.
  + HWLOC_API_VERSION and hwloc_get_api_version() now give 0x00020000.
  + See "How do I handle ABI breaks and API upgrades ?" in the FAQ
    and "Upgrading to hwloc 2.0 API" in the documentation.
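
The value 0x00020000 quoted above packs a version number into one
integer. Assuming the (X<<16)+(Y<<8)+Z layout for a release X.Y.Z, it
can be decoded as sketched below. The structure and the function are
hypothetical helpers written for this example. Only HWLOC_API_VERSION
and hwloc_get_api_version() are hwloc names.

```c
#include <assert.h>

/* Hypothetical helper: decode a version packed as (X<<16)+(Y<<8)+Z. */
struct api_version {
    unsigned major, minor, release;
};

struct api_version decode_api_version(unsigned v)
{
    struct api_version r;
    r.major = v >> 16;
    r.minor = (v >> 8) & 0xffu;
    r.release = v & 0xffu;
    return r;
}
```

Decoding 0x00020000 gives 2.0.0. An application can compare the
build-time macro with the run-time function result to detect a mismatch
between headers and library.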
* Major API changes
  + Memory, I/O and Misc objects are now stored in dedicated children lists,
    not in the usual children list that is now only used for CPU-side objects.
    - hwloc_get_next_child() may still be used to iterate over these 4 lists
      of children at once.
    - hwloc_obj_type_is_normal(), _memory() and _io() may be used to check
      the kind of a given object type.
  + Topologies always have at least one NUMA object. On non-NUMA machines,
    a single NUMA object is added to describe the entire machine memory.
    The NUMA level cannot be ignored anymore.
  + The NUMA level is special since NUMA nodes are not in the main hierarchy
    of objects anymore. Its depth is a fake negative depth that should not be
    compared with normal levels.
    - If all memory objects are attached to parents at the same depth,
      that depth may be retrieved with hwloc_get_memory_parents_depth().
  + The HWLOC_OBJ_CACHE type is replaced with 8 types HWLOC_OBJ_L[1-5]CACHE
    and HWLOC_OBJ_L[1-3]ICACHE that remove the need to disambiguate levels
    when looking for caches with _by_type() functions.
    - New hwloc_obj_type_is_{,d,i}cache() functions may be used to check whether
      a given type is a cache.
  + Reworked ignoring/filtering API
    - Replace hwloc_topology_ignore*() functions with hwloc_topology_set_type_filter()
      and hwloc_topology_set_all_types_filter().
      . Contrary to hwloc_topology_ignore_{type,all}_keep_structure() which
        removed individual objects, HWLOC_TYPE_FILTER_KEEP_STRUCTURE only removes
        entire levels (so that topologies do not become too asymmetric).
    - Remove HWLOC_TOPOLOGY_FLAG_ICACHES in favor of hwloc_topology_set_icache_types_filter()
      with HWLOC_TYPE_FILTER_KEEP_ALL.
    - Remove HWLOC_TOPOLOGY_FLAG_IO_DEVICES, _IO_BRIDGES and _WHOLE_IO in favor of
      hwloc_topology_set_io_types_filter() with HWLOC_TYPE_FILTER_KEEP_ALL or
      HWLOC_TYPE_FILTER_KEEP_IMPORTANT.
  + The distance API has been completely reworked. It is now described
    in hwloc/distances.h.
  + Return values
    - Most functions in hwloc/bitmap.h now return an int that may be negative
      in case of failure to realloc/extend the internal storage of a bitmap.
    - hwloc_obj_add_info() also returns an int in case allocations fail.
* Minor API changes
  + Object attributes
    - obj->memory is removed.
      . local_memory and page_types attributes are now in obj->attr->numanode.
      . total_memory moves to obj->total_memory.
    - Objects do not have allowed_cpuset and allowed_nodeset anymore.
      They are only available for the entire topology using
      hwloc_topology_get_allowed_cpuset() and hwloc_topology_get_allowed_nodeset().
    - Objects now have a "subtype" field that supersedes former "Type" and
      "CoProcType" info attributes.
  + Object and level depths are now signed ints.
  + Object string printing and parsing
    - hwloc_type_sscanf() deprecates the old hwloc_obj_type_sscanf().
    - hwloc_type_sscanf_as_depth() is added to convert a type name into
      a level depth.
    - hwloc_obj_cpuset_snprintf() is deprecated in favor of hwloc_bitmap_snprintf().
  + Misc objects
    - Replace hwloc_topology_insert_misc_object_by_cpuset() with
      hwloc_topology_insert_group_object() to precisely specify the location
      of an additional hierarchy level in the topology.
    - Misc objects have their own level and depth to iterate over all of them.
    - Misc objects may now only be inserted as a leaf object with
      hwloc_topology_insert_misc_object() which deprecates
      hwloc_topology_insert_misc_object_by_parent().
  + hwloc_topology_restrict() doesn't remove objects that contain memory
    by default anymore.
    - The list of existing restrict flags was modified.
  + The discovery support array now contains some NUMA specific bits.
  + XML export functions take an additional flags argument,
    for instance for exporting XMLs that are compatible with hwloc 1.x.
  + Functions diff_load_xml*(), diff_export_xml*() and diff_destroy() in
    hwloc/diff.h do not need a topology as first parameter anymore.
  + hwloc_parse_cpumap_file() is superseded by hwloc_linux_read_path_as_cpumask()
    in hwloc/linux.h.
  + HWLOC_MEMBIND_DEFAULT and HWLOC_MEMBIND_FIRSTTOUCH were clarified.
* New APIs and Features
  + Add hwloc/shmem.h for sharing topologies between processes running on
    the same machine (for reducing the memory footprint).
  + Add the experimental netloc subproject. It is disabled by default
    and can be enabled with --enable-netloc.
    It currently brings command-line tools to gather and visualize the
    topology of InfiniBand fabrics, and an API to convert such topologies
    into Scotch architectures for process mapping.
    See the documentation for details.
* Removed APIs and features
  + Remove the online_cpuset from struct hwloc_obj. Offline PUs get unknown
    topologies on Linux nowadays, and wrong topology on Solaris. Other OSes
    do not support them. And one cannot do much about them anyway. Just keep
    them in complete_cpuset.
  + Remove the now-unused "System" object type HWLOC_OBJ_SYSTEM,
    defined to MACHINE for backward compatibility.
  + The almost-unused "os_level" attribute has been removed from the
    hwloc_obj structure.
  + Remove the custom interface for assembling the topologies of different
    nodes as well as the hwloc-assembler tools.
  + hwloc_topology_set_fsroot() is removed, the environment variable
    HWLOC_FSROOT may be used for the same remote testing/debugging purpose.
  + Remove the deprecated hwloc_obj_snprintf(), hwloc_obj_type_of_string(),
    hwloc_distribute[v]().
  + Remove Myrinet Express interoperability (hwloc/myriexpress.h).
  + Remove Kerrighed support from the Linux backend.
  + Remove Tru64 (OSF/1) support.
    - Remove HWLOC_MEMBIND_REPLICATE which wasn't available anywhere else.
* Backend improvements
  + Linux
    - OS devices do not have to be attached through PCI anymore,
      for instance enabling the discovery of NVDIMM block devices.
    - Remove the dependency on libnuma.
    - Add a SectorSize attribute to block OS devices.
  + Mac OS X
    - Fix detection of cores and hyperthreads.
    - Add CPUVendor, Model, ... attributes.
  + Windows
    - Add get_area_memlocation().
* Tools
  + lstopo and hwloc-info have a new --filter option matching the new filtering API.
  + lstopo can be given --children-order=plain to force a basic display
    of memory and normal children together below their parent.
  + hwloc-distances was removed and replaced with lstopo --distances.
* Misc
  + Exports
    - Exporting to synthetic now ignores I/O and Misc objects.
  + PCI discovery
    - Separate OS device discovery from PCI discovery. Only the latter is disabled
      with --disable-pci at configure time. Both may be disabled with --disable-io.
    - The `linuxpci' component is now renamed into `linuxio'.
    - The old `libpci' component name from hwloc 1.6 is not supported anymore,
      only the `pci' name from hwloc 1.7 is now recognized.
    - The HWLOC_PCI_<domain>_<bus>_LOCALCPUS environment variables are superseded
      with a single HWLOC_PCI_LOCALITY where bus ranges may be specified.
    - Do not set PCI device and bridge names automatically. Vendor and device
      names are already in info attributes.
  + Components and discovery
    - Add HWLOC_SYNTHETIC environment variable to enforce a synthetic topology
      as if hwloc_topology_set_synthetic() had been called.
    - HWLOC_COMPONENTS doesn't support xml or synthetic component attributes
      anymore, they should be passed in HWLOC_XMLFILE or HWLOC_SYNTHETIC instead.
    - HWLOC_COMPONENTS takes precedence over other environment variables
      for selecting components.
  + hwloc now requires a C99 compliant compiler.
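
As an illustration of the 0x00020000 value that the 2.0.0 notes above give
for HWLOC_API_VERSION, the sketch below splits such a number into its
components. It assumes the (X<<16)+(Y<<8)+Z encoding for a release X.Y.Z
described in hwloc.h. The struct and function names are hypothetical
helpers and are not part of hwloc.

```c
#include <assert.h>

/* Hypothetical helper type (not part of hwloc). */
struct api_version {
    unsigned major;
    unsigned minor;
    unsigned release;
};

/* Split an API version number, assumed to be encoded as
 * (major<<16)+(minor<<8)+release, into its three parts. */
struct api_version decode_api_version(unsigned v)
{
    struct api_version r;
    r.major = (v >> 16) & 0xff;
    r.minor = (v >> 8) & 0xff;
    r.release = v & 0xff;
    return r;
}

/* Return 1 if the version belongs to the 2.x API or later, 0 otherwise. */
int api_is_v2_or_later(unsigned v)
{
    return decode_api_version(v).major >= 2;
}
```

Application code would more typically compare HWLOC_API_VERSION against
0x00020000 at build time, and compare it against hwloc_get_api_version()
at run time to detect an ABI mismatch.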


Version 1.11.13 (also included in 2.0.4)
---------------
* Add support for Linux 5.3 new sysfs cpu topology files with Die information.
* Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
* Tiles, Modules and Dies are exposed as Groups for now.
  + HWLOC_DONT_MERGE_DIE_GROUPS=1 may be set in the environment to prevent
    Die groups from being automatically merged with identical parent or children.
* Ignore NUMA node information from AMD topoext in the x86 backend,
  unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
* Group objects have a new "dont_merge" attribute to prevent them from
  being automatically merged with identical parent or children.


Version 1.11.12 (also included in 2.0.3)
---------------
* Fix a corner case of hwloc_topology_restrict() where children would
  become out-of-order.
* Fix the return length of export_xmlbuffer() functions to always
  include the ending \0.
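
The export_xmlbuffer() fix above means that the reported length counts the
ending \0. The sketch below shows what that implies for a caller. It is
plain C without hwloc: the buffer and length stand in for what an
export_xmlbuffer() function would return, and xml_text_length() is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given an XML buffer and a length that includes the
 * ending \0, return the number of characters of actual XML text. */
size_t xml_text_length(const char *buf, int buflen)
{
    assert(buf != NULL && buflen > 0);
    assert(buf[buflen - 1] == '\0');   /* the ending \0 is part of buflen */
    return (size_t)buflen - 1;
}
```

A caller that writes the buffer to a file would write xml_text_length()
bytes rather than the full reported length, so that the terminator is not
stored in the file.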


Version 1.11.11 (also included in 2.0.2)
---------------
* Add support for Hygon Dhyana processors in the x86 backend,
  thanks to Pu Wen for the patch.
* Fix symbol renaming to also rename internal components,
  thanks to Evan Ramos for the patch.
* Fix build on HP-UX, thanks to Richard Lloyd for reporting the issues.
* Detect PCI link speed without being root on Linux >= 4.13.


Version 1.11.10 (also included in 2.0.1)
---------------
* Fix detection of cores and hyperthreads on Mac OS X.
* Serialize pciaccess discovery to fix concurrent topology loads in
  multiple threads.
* Fix first touch area memory binding on Linux when thread memory
  binding is different.
* Some minor fixes to memory binding.
* Fix hwloc-dump-hwdata to only process SMBIOS information that corresponds
  to the KNL and KNM configuration.
* Add a heuristic for guessing KNL/KNM memory and cluster modes when
  hwloc-dump-hwdata could not run as root earlier.
* Fix discovery of NVMe OS devices on Linux >= 4.0.
* Add get_area_memlocation() on Windows.
* Add CPUVendor, Model, ... attributes on Mac OS X.


Version 1.11.9
--------------
* Add support for Zhaoxin ZX-C and ZX-D processors in the x86 backend,
  thanks to Jeff Zhao for the patch.
* Fix AMD Epyc 24-core L3 cache locality in the x86 backend.
* Don't crash in the x86 backend when the CPUID vendor string is unknown.
* Fix the missing pu discovery support bit on some OS.
* Fix the management of the lstopoStyle info attribute for custom colors.
* Add verbose warnings when failing to load hwloc v2.0+ XMLs.


Version 1.11.8
--------------
* Multiple Solaris improvements, thanks to Maureen Chew for the help:
  + Detect caches on Sparc.
  + Properly detect allowed/disallowed PUs and NUMA nodes with processor sets.
  + Add hwloc_get_last_cpu_location() support for the current thread.
* Add support for CUDA compute capability 7.0 and fix support for 6.[12].
* Tools improvements
  + Fix search for objects by physical index in command-line tools.
  + Add missing "cpubind:get_thisthread_last_cpu_location" in the output
    of hwloc-info --support.
  + Add --pid and --name to specify target processes in hwloc-ps.
  + Display thread names in lstopo and hwloc-ps on Linux.
* Doc improvements
  + Add a FAQ entry about building on Windows.
  + Install missing sub-manpage for hwloc_obj_add_info() and
    hwloc_obj_get_info_by_name().


Version 1.11.7
--------------
* Fix hwloc-bind --membind for CPU-less NUMA nodes (again).
  Thanks to Gilles Gouaillardet for reporting the issue.
* Fix a memory leak on IBM S/390 platforms running Linux.
* Fix a memory leak when forcing the x86 backend first on amd64/topoext
  platforms running Linux.
* Command-line tools now support "hbm" instead of "numanode" for filtering
  only high-bandwidth memory nodes when selecting locations.
  + hwloc-bind also supports --hbm and --no-hbm for filtering only or
    no HBM nodes.
  Thanks to Nicolas Denoyelle for the suggestion.
* Add --children and --descendants to hwloc-info for listing object
  children or object descendants of a specific type.
* Add --no-index, --index, --no-attrs, --attrs to disable/enable display
  of index numbers or attributes in the graphical lstopo output.
* Try to gather hwloc-dump-hwdata output from all possible locations
  in hwloc-gather-topology.
* Updates to the documentation of locations in hwloc(7) and
  command-line tools manpages.


Version 1.11.6
--------------
* Make the Linux discovery about twice as fast, especially on the CPU side,
  by trying to avoid sysfs file accesses as much as possible.
* Add support for AMD Family 17h processors (Zen) SMT cores in the Linux
  and x86 backends.
* Add the HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES flag (and the
  HWLOC_THISSYSTEM_ALLOWED_RESOURCES environment variable) for reading the
  set of allowed resources from the local operating system even if the
  topology was loaded from XML or synthetic.
* Fix hwloc_bitmap_set/clr_range() for infinite ranges that do not
  overlap currently defined ranges in the bitmap.
* Don't reset the lstopo zoom scale when moving the X11 window.
* lstopo now has --flags for manually setting topology flags.
* hwloc_get_depth_type() returns HWLOC_TYPE_DEPTH_UNKNOWN for Misc objects.


Version 1.11.5
--------------
* Add support for Knights Mill Xeon Phi, thanks to Piotr Luc for the patch.
* Reenable distance gathering on Solaris, disabled by mistake since v1.0.
  Thanks to TU Wien for the help.
* Fix hwloc_get_*obj*_inside_cpuset() functions to ignore objects with
  empty CPU sets, for instance, CPU-less NUMA nodes such as KNL MCDRAM.
  Thanks to Nicolas Denoyelle for the report.
* Fix XML import of multiple distance matrices.
* Add a FAQ entry about "hwloc is only a structural model, it ignores
  performance models, memory bandwidth, etc.?"


Version 1.11.4
--------------
* Add MemoryMode and ClusterMode attributes in the Machine object on KNL.
  Add doc/examples/get-knl-modes.c for an example of retrieving them.
  Thanks to Grzegorz Andrejczuk.
* Fix Linux build with -m32 with respect to libudev.
  Thanks to Paul Hargrove for reporting the issue.
* Fix build with Visual Studio 2015, thanks to Eloi Gaudry for reporting
  the issue and providing the patch.
* Don't forget to display OS device children in the graphical lstopo.
* Fix a memory leak on Solaris, thanks to Bryon Gloden for the patch.
* Properly handle realloc() failures, thanks to Bryon Gloden for reporting
  the issue.
* Fix lstopo crash in ascii/fig/windows outputs when some objects have a
  lstopoStyle info attribute.
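
The realloc() item above refers to a common C pitfall: assigning the result
of realloc() straight back to the only pointer to a buffer leaks the old
buffer when realloc() fails. The sketch below shows the usual idiom with a
temporary pointer. It is a generic illustration rather than the actual
hwloc code, and grow_buffer() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Grow *bufp to newsize bytes (newsize must be non-zero).
 * On success, update *bufp and return 0.
 * On failure, return -1 and leave *bufp untouched so that the caller
 * can still use or free the old buffer. */
int grow_buffer(char **bufp, size_t newsize)
{
    char *tmp;

    assert(bufp != NULL && newsize > 0);
    tmp = realloc(*bufp, newsize);
    if (!tmp)
        return -1;      /* the old buffer is still valid */
    *bufp = tmp;
    return 0;
}
```

The important design choice is that the caller's pointer is only
overwritten once the new allocation is known to have succeeded.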


Version 1.11.3
--------------
* Bug fixes
  + Fix a memory leak on Linux S/390 hosts with books.
  + Fix /proc/mounts parsing on Linux by using mntent.h.
    Thanks to Nathan Hjelm for reporting the issue.
  + Fix an x86 infinite loop on VMware due to the x2APIC feature being
    advertised without actually being fully supported.
    Thanks to Jianjun Wen for reporting the problem and testing the patch.
  + Fix the return value of hwloc_alloc() on mmap() failure.
    Thanks to Hugo Brunie for reporting the issue.
  + Fix the return value of command-line tools in some error cases.
  + Do not break individual thread bindings during x86 backend discovery in a
    multithreaded process. Thanks to Farouk Mansouri for the report.
  + Fix hwloc-bind --membind for CPU-less NUMA nodes.
  + Fix some corner cases in the XML export/import of application userdata.
* API Improvements
  + Add HWLOC_MEMBIND_BYNODESET flag so that membind() functions accept
    either cpusets or nodesets.
  + Add hwloc_get_area_memlocation() to check where pages are actually
    allocated. Only implemented on Linux for now.
    - There's no _nodeset() variant, but the new flag HWLOC_MEMBIND_BYNODESET
      is supported.
  + Make hwloc_obj_type_sscanf() parse back everything that may be outputted
    by hwloc_obj_type_snprintf().
* Detection Improvements
  + Allow the x86 backend to add missing cache levels, so that it completes
    what the Solaris backend lacks.
    Thanks to Ryan Zezeski for reporting the issue.
  + Do not filter out FibreChannel PCI adapters by default anymore.
    Thanks to Matt Muggeridge for the report.
  + Add support for CUDA compute capability 6.x.
* Tools
  + Add --support to hwloc-info to list supported features, just like with
    hwloc_topology_get_support().
    - Also add --objects and --topology to explicitly switch between the
      default modes.
  + Add --tid to let hwloc-bind operate on individual threads on Linux.
  + Add --nodeset to let hwloc-bind report memory binding as NUMA node sets.
  + hwloc-annotate and lstopo don't drop application userdata from XMLs anymore.
    - Add --cu to hwloc-annotate to drop these application userdata.
  + Make the hwloc-dump-hwdata dump directory configurable through configure
    options such as --runstatedir or --localstatedir.
* Misc Improvements
  + Add systemd service template contrib/systemd/hwloc-dump-hwdata.service
    for launching hwloc-dump-hwdata at boot on Linux.
    Thanks to Grzegorz Andrejczuk.
  + Add HWLOC_PLUGINS_BLACKLIST environment variable to prevent some plugins
    from being loaded. Thanks to Alexandre Denis for the suggestion.
  + Small improvements for various Windows build systems,
    thanks to Jonathan L Peyton and Marco Atzeri.
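
One of the 1.11.3 bug fixes above concerns the return value of hwloc_alloc()
when mmap() fails. The underlying pitfall is that mmap() reports failure
with MAP_FAILED rather than NULL. The sketch below shows a wrapper that
converts this failure into NULL. It is a generic POSIX illustration and
not the hwloc implementation: alloc_pages() is a hypothetical name, and
MAP_ANONYMOUS is assumed to be available on the platform.

```c
#define _DEFAULT_SOURCE   /* assumption: needed for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate len bytes of anonymous memory with mmap().
 * Return NULL on failure, like malloc(), instead of MAP_FAILED. */
void *alloc_pages(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Callers can then test the result against NULL as they would for malloc(),
and release the memory with munmap() and the same length.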


Version 1.11.2
--------------
* Improve support for Intel Knights Landing Xeon Phi on Linux:
  + Group local NUMA nodes of normal memory (DDR) and high-bandwidth memory
    (MCDRAM) together through "Cluster" groups so that the local MCDRAM is
    easy to find.
    - See "How do I find the local MCDRAM NUMA node on Intel Knights
      Landing Xeon Phi?" in the documentation.
    - For uniformity across all KNL configurations, always have a NUMA node
      object even if the host is UMA.
  + Fix the detection of the memory-side cache:
    - Add the hwloc-dump-hwdata superuser utility to dump SMBIOS information
      into /var/run/hwloc/ as root during boot, and load this dumped
      information from the hwloc library at runtime.
    - See "Why do I need hwloc-dump-hwdata for caches on Intel Knights
      Landing Xeon Phi?" in the documentation.
  Thanks to Grzegorz Andrejczuk for the patches and for the help.
* The x86 and linux backends may now be combined for discovering CPUs
  through x86 CPUID and memory from the Linux kernel.
  This is useful for working around buggy CPU information reported by Linux
  (for instance the AMD Bulldozer/Piledriver bug below).
  Combination is enabled by passing HWLOC_COMPONENTS=x86 in the environment.
* Fix L3 cache sharing on AMD Opteron 63xx (Piledriver) and 62xx (Bulldozer)
  in the x86 backend. Thanks to many users who helped.
* Fix the overzealous L3 cache sharing fix added to the x86 backend in 1.11.1
  for AMD Opteron 61xx (Magny-Cours) processors.
* The x86 backend may now add the info attribute Inclusive=0 or 1 to caches
  it discovers, or to caches discovered by other backends earlier.
  Thanks to Guillaume Beauchamp for the patch.
* Fix the management of alloc_membind() allocation failures on AIX, HP-UX
  and OSF/Tru64.
* Fix spurious failures to load with ENOMEM on AIX in case of Misc objects
  below PUs.
* lstopo improvements in X11 and Windows graphical mode:
  + Add + - f 1 shortcuts to manually zoom-in, zoom-out, reset the scale,
    or fit the entire window.
  + Display all keyboard shortcuts in the console.
* Debug messages may be disabled at runtime by passing HWLOC_DEBUG_VERBOSE=0
  in the environment when --enable-debug was passed to configure.
* Add a FAQ entry "What are these Group objects in my topology?".


Version 1.11.1
--------------
* Detection fixes
  + Hardwire the topology of Fujitsu K-computer, FX10, FX100 servers to
    work around buggy Linux kernels.
    Thanks to Takahiro Kawashima and Gilles Gouaillardet.
  + Fix L3 cache information on AMD Opteron 61xx Magny-Cours processors
    in the x86 backend. Thanks to Guillaume Beauchamp for the patch.
  + Detect block devices directly attached to PCI without a controller,
    for instance NVMe disks. Thanks to Barry M. Tannenbaum.
  + Add the PCISlot attribute to all PCI functions instead of only the
    first one.
* Miscellaneous internal fixes
  + Ignore PCI bridges that could fail assertions by reporting buggy
    secondary-subordinate bus numbers.
    Thanks to George Bosilca for reporting the issue.
  + Fix an overzealous assertion when inserting an intermediate Group object
    while Groups are totally ignored.
  + Fix a memory leak on Linux on AMD processors with dual-core compute units.
    Thanks to Bob Benner.
  + Fix a memory leak on failure to load an XML diff file.
  + Fix some segfaults when inputting an invalid synthetic description.
  + Fix a segfault when plugins fail to find core symbols.
    Thanks to Guy Streeter.
* Many fixes and improvements in the Windows backend:
  + Fix the discovery of more than 32 processors and multiple processor
    groups. Thanks to Barry M. Tannenbaum for the help.
  + Add thread binding set support in case of multiple processor groups.
  + Add thread binding get support.
  + Add get_last_cpu_location() support for the current thread.
  + Disable the unsupported process binding in case of multiple processor
    groups.
  + Fix/update the Visual Studio support under contrib/windows.
    Thanks to Eloi Gaudry for the help.
* Tools fixes
  + Fix a segfault when displaying logical indexes in the graphical lstopo.
    Thanks to Guillaume Mercier for reporting the issue.
  + Fix lstopo linking with X11 libraries, for instance on Mac OS X.
    Thanks to Scott Atchley and Pierre Ramet for reporting the issue.
  + hwloc-annotate, hwloc-diff and hwloc-patch do not drop unavailable
    resources from the output anymore and those may be annotated as well.
  + Command-line tools may now import XML from the standard input with -i -.xml
  + Add missing documentation for the hwloc-info --no-icaches option.


Version 1.11.0
--------------
* API
  + Socket objects are renamed into Package to align with the terminology
    used by processor vendors. The old HWLOC_OBJ_SOCKET type and "Socket"
    name are still supported for backward compatibility.
  + HWLOC_OBJ_NODE is replaced with HWLOC_OBJ_NUMANODE for clarification.
    HWLOC_OBJ_NODE is still supported for backward compatibility.
    "Node" and "NUMANode" strings are supported as in earlier releases.
* Detection improvements
  + Add support for Intel Knights Landing Xeon Phi.
    Thanks to Grzegorz Andrejczuk and Lukasz Anaczkowski.
  + Add Vendor, Model, Revision, SerialNumber, Type and LinuxDeviceID
    info attributes to Block OS devices on Linux. Thanks to Vineet Pedaballe
    for the help.
    - Add --disable-libudev to avoid dependency on the libudev library.
  + Add "MemoryModule" Misc objects with information about DIMMs, on Linux
    when privileged and when I/O is enabled.
    Thanks to Vineet Pedaballe for the help.
  + Add a PCISlot attribute to PCI devices on Linux when supported to
    identify the physical PCI slot where the board is plugged.
  + Add CPUStepping info attribute on x86 processors,
    thanks to Thomas Röhl for the suggestion.
  + Ignore the device-tree on non-Power architectures to avoid buggy
    detection on ARM. Thanks to Orion Poplawski for reporting the issue.
  + Work around buggy Xeon E5v3 BIOS reporting invalid PCI-NUMA affinity
    for the PCI links on the second processor.
  + Add support for CUDA compute capability 5.x, thanks to Benjamin Worpitz.
  + Many fixes to the x86 backend
    - Add L1i and fix L2/L3 type on old AMD processors without topoext support.
    - Fix Intel CPU family and model numbers when basic family isn't 6 or 15.
    - Fix package IDs on recent AMD processors.
    - Fix misc issues due to incomplete APIC IDs on x2APIC processors.
    - Avoid buggy discovery on old SGI Altix UVs with non-unique APIC IDs.
  + Gather total machine memory on NetBSD.
* Tools
  + lstopo
    - Collapse identical PCI devices unless --no-collapse is given.
      This avoids gigantic outputs when a PCI device contains dozens of
      identical virtual functions.
    - The ASCII art output is now called "ascii", for instance in
      "lstopo -.ascii".
      The former "txt" extension is retained for backward compatibility.
    - Automatically scale graphical box width to the inner text in Cairo,
      ASCII and Windows outputs.
    - Add --rect to lstopo to force rectangular layout even for NUMA nodes.
    - Add --restrict-flags to configure the behavior of --restrict.
    - Objects may have a "Type" info attribute to specify a better type name
      and display it in lstopo.
    - Really export all verbose information to the given output file.
  + hwloc-annotate
    - May now operate on all types of objects, including I/O.
    - May now insert Misc objects in the topology.
    - Do not drop instruction caches and I/O devices from the output anymore.
  + Fix lstopo path in hwloc-gather-topology after install.
* Misc
  + Fix hwloc/cudart.h for machines with multiple PCI domains,
    thanks to Imre Kerr for reporting the problem.
  + Fix PCI Bridge-specific depth attribute.
  + Fix hwloc_bitmap_intersects() for two infinite bitmaps.
  + Fix some corner cases in the building of levels on large NUMA machines
    with non-uniform NUMA groups and I/Os.
  + Improve the performance of object insertion by cpuset for large
    topologies.
  + Prefix verbose XML import errors with the source name.
  + Improve pkg-config checks and error messages.
  + Fix excluding after a component with an argument in the HWLOC_COMPONENTS
    environment variable.
* Documentation
  + Fix the recommended way in documentation and examples to allocate memory
    on some node, it should use HWLOC_MEMBIND_BIND.
    Thanks to Nicolas Bouzat for reporting the issue.
  + Add a "Miscellaneous objects" section in the documentation.
  + Add a FAQ entry "What happens to my topology if I disable symmetric
    multithreading, hyper-threading, etc. ?" to the documentation.


Version 1.10.1
--------------
* Actually remove disallowed NUMA nodes from nodesets when the whole-system
  flag isn't enabled.
* Fix the gathering of PCI domains. Thanks to James Custer for reporting
  the issue and providing a patch.
* Fix the merging of identical parent and child in presence of Misc objects.
  Thanks to Dave Love for reporting the issue.
* Fix some misordering of children when merging with ignore_keep_structure()
  in partially allowed topologies.
* Fix an overzealous assertion in the debug code when running on a single-PU
  host with I/O. Thanks to Thomas Van Doren for reporting the issue.
* Don't forget to set up NUMA node object nodesets in x86 backend (for BSDs)
  and OSF/Tru64 backend.
* Fix cpuid-x86 build error with gcc -O3 on x86-32. Thanks to Thomas Van Doren
  for reporting the issue.
* Fix support for future very large caches in the x86 backend.
* Fix vendor/device names for SR-IOV PCI devices on Linux.
* Fix an unlikely crash in case of buggy hierarchical distance matrix.
* Fix PU os_index on some AIX releases. Thanks to Hendryk Bockelmann and
  Erik Schnetter for helping with debugging.
* Fix hwloc_bitmap_isincluded() in case of infinite sets.
* Change hwloc-ls.desktop into a lstopo.desktop and only install it if
  lstopo is built with Cairo/X11 support. It cannot work with a non-graphical
  lstopo or hwloc-ls.
* Add support for the renaming of Socket into Package in future releases.
* Add support for the replacement of HWLOC_OBJ_NODE with HWLOC_OBJ_NUMANODE
  in future releases.
* Clarify the documentation of distance matrices in hwloc.h and in the manpage
  of hwloc-distances. Thanks to Dave Love for the suggestion.
* Improve some error messages by displaying more information about the
  hwloc library in use.
* Document how to deal with the ABI break when upgrading to the upcoming 2.0.
  See "How do I handle ABI breaks and API upgrades ?" in the FAQ.


Version 1.10.0
--------------
* API
  + Add hwloc_topology_export_synthetic() to export a topology to a
    synthetic string without using lstopo. See the Synthetic topologies
    section in the documentation.
  + Add hwloc_topology_set/get_userdata() to let the application save
    a private pointer in the topology whenever it needs a way to find
    its own object corresponding to a topology.
  + Add hwloc_get_numanode_obj_by_os_index() and document that this function
    as well as hwloc_get_pu_obj_by_os_index() are good at converting
    nodesets and cpusets into objects.
  + hwloc_distrib() does not ignore any objects anymore when there are
    too many of them. They get merged with others instead.
    Thanks to Tim Creech for reporting the issue.
* Tools
  + hwloc-bind --get <command-line> now executes the command after displaying
    the binding instead of ignoring the command entirely.
    Thanks to John Donners for the suggestion.
  + Clarify that memory sizes shown in lstopo are local by default
    unless specified (total memory added in the root object).
* Synthetic topologies
  + Synthetic topology descriptions may now specify attributes such as
    memory sizes and OS indexes. See the Synthetic topologies section
    in the documentation.
  + lstopo now exports in this fully-detailed format by default.
    The new option --export-synthetic-flags may be used to revert
    to the old format.
* Documentation
  + Add the doc/examples/ subdirectory with several real-life examples,
    including the already existing hwloc-hello.c for basics.
    Thanks to Rob Aulwes for the suggestion.
  + Improve the documentation of CPU and memory binding in the API.
  + Add a FAQ entry about operating system errors, especially on AMD
    platforms with buggy cache information.
  + Add a FAQ entry about loading many topologies in a single program.
* Misc
  + Work around buggy Linux kernels reporting 2 sockets instead of
    1 socket with 2 NUMA nodes for each Xeon E5 v3 (Haswell) processor.
  + pciutils/libpci support is now removed since libpciaccess works
    well and there's also a Linux-specific PCI backend. For the record,
    pciutils was GPL and therefore disabled by default since v1.6.2.
  + Add --disable-cpuid configure flag to work around buggy processor
    simulators reporting invalid CPUID information.
    Thanks to Andrew Friedley for reporting the issue.
  + Fix a racy use of libltdl when manipulating multiple topologies in
    different threads.
    Thanks to Andra Hugo for reporting the issue and testing patches.
  + Fix some build failures in private/misc.h.
    Thanks to Pavan Balaji and Ralph Castain for the reports.
  + Fix failures to detect X11/Xutil.h on some Solaris platforms.
    Thanks to Siegmar Gross for reporting the failure.
  + The plugin ABI has changed, this release will not load plugins
    built against previous hwloc releases.
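
To make the synthetic description format from the 1.10.0 notes above more
concrete, the sketch below counts the leaf objects described by a string
such as "NUMANode:2(memory=1GB) Core:4 PU:2". It multiplies the per-level
counts and skips attributes given in parentheses. This is a simplified
plain-C illustration that assumes each level is written as Type:count. It
is not the hwloc parser, the example strings are only illustrative, and
synthetic_leaf_count() is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Multiply the count of every "Type:count" item in desc, ignoring anything
 * inside parentheses (attributes such as memory=... or indexes=...).
 * Return 0 if no item was found. Overflow is not handled in this sketch. */
unsigned long synthetic_leaf_count(const char *desc)
{
    unsigned long total = 1;
    int depth = 0;      /* current nesting level of parentheses */
    int found = 0;
    const char *p;

    assert(desc != NULL);
    for (p = desc; *p; p++) {
        if (*p == '(') {
            depth++;
        } else if (*p == ')') {
            if (depth > 0)
                depth--;
        } else if (*p == ':' && depth == 0 && isdigit((unsigned char)p[1])) {
            char *end;
            total *= strtoul(p + 1, &end, 10);
            found = 1;
            p = end - 1;    /* the loop increment moves past the number */
        }
    }
    return found ? total : 0;
}
```

Tracking the parenthesis depth matters because attribute values may
themselves contain colons, which must not be mistaken for level counts.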


Version 1.9.1
-------------
* Fix a crash when the PCI locality is invalid. Attach to the root object
  instead. Thanks to Nicolas Denoyelle for reporting the issue.
* Fix -f in lstopo manpage. Thanks to Jirka Hladky for reporting the issue.
* Fix hwloc_obj_type_sscanf() and others when strncasecmp() is not properly
  available. Thanks to Nick Papior Andersen for reporting the problem.
* Mark Linux file descriptors as close-on-exec to avoid leaks on exec.
* Fix some minor memory leaks.
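
The close-on-exec item above can be illustrated with plain POSIX calls:
opening a file with O_CLOEXEC makes the kernel close the descriptor
automatically when the process calls one of the exec() functions, so it
does not leak into the new program. This is a generic sketch and not the
code used in the hwloc Linux backend. open_cloexec() and fd_is_cloexec()
are hypothetical helpers.

```c
#define _POSIX_C_SOURCE 200809L   /* for O_CLOEXEC */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open path read-only with the close-on-exec flag set atomically.
 * Return the file descriptor, or -1 on error. */
int open_cloexec(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Return 1 if fd has the close-on-exec flag set, 0 otherwise. */
int fd_is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

Setting the flag at open() time, rather than with a later fcntl() call,
avoids a window during which another thread could fork and exec with the
descriptor still inheritable.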


Version 1.9.0
-------------
* API
  + Add hwloc_obj_type_sscanf() to extend hwloc_obj_type_of_string() with
    type-specific attributes such as Cache/Group depth and Cache type.
    hwloc_obj_type_of_string() is moved to hwloc/deprecated.h.
  + Add hwloc_linux_get_tid_last_cpu_location() for retrieving the
    last CPU where a Linux thread given by TID ran.
  + Add hwloc_distrib() to extend the old hwloc_distribute[v]() functions.
    hwloc_distribute[v]() is moved to hwloc/deprecated.h.
  + Don't mix total and local memory when displaying verbose object attributes
    with hwloc_obj_attr_snprintf() or in lstopo.
* Backends
  + Add CPUVendor, CPUModelNumber and CPUFamilyNumber info attributes for
    x86, ia64 and Xeon Phi sockets on Linux, to extend the x86-specific
    support added in v1.8.1. Requested by Ralph Castain.
  + Add many CPU- and Platform-related info attributes on ARM and POWER
    platforms, in the Machine and Socket objects.
  + Add CUDA info attributes describing the number of multiprocessors and
    cores and the size of the global, shared and L2 cache memories in CUDA
    OS devices.
  + Add OpenCL info attributes describing the number of compute units and
    the global memory size in OpenCL OS devices.
  + The synthetic backend now accepts extended types such as L2Cache, L1i or
    Group3. lstopo also exports synthetic strings using these extended types.
* Tools
  + lstopo
    - Do not overwrite output files by default anymore.
      Pass -f or --force to enforce it.
    - Display OpenCL, CUDA and Xeon Phi numbers of cores and memory sizes
      in the graphical output.
    - Fix export to stdout when specifying a Cairo-based output type
      with --of.
  + hwloc-ps
    - Add -e or --get-last-cpu-location to report where processes/threads
      run instead of where they are bound.
    - Report locations as likely-more-useful objects such as Cores or Sockets
      instead of Caches when possible.
  + hwloc-bind
    - Fix failure on Windows when not using --pid.
    - Add -e as a synonym to --get-last-cpu-location.
  + hwloc-distrib
    - Add --reverse to distribute using last objects first and singlify
      into last bits first. Thanks to Jirka Hladky for the suggestion.
  + hwloc-info
    - Report unified caches when looking for data or instruction cache
      ancestor objects.
* Misc
  + Add experimental Visual Studio support under contrib/windows.
    Thanks to Eloi Gaudry for his help and for providing the first draft.
  + Fix some overzealous assertions and warnings about the ordering of
    objects on a level with respect to cpusets. The ordering is only
    guaranteed for complete cpusets (based on the first bit in sets).
  + Fix some memory leaks when importing xml diffs and when exporting a
    "too complex" entry.
|
|
|
|
|
|
Version 1.8.1
|
|
-------------
|
|
* Fix the cpuid code on 64-bit Windows so that the x86 backend gets
|
|
enabled as expected and can populate CPU information.
|
|
Thanks to Robin Scher for reporting the problem.
|
|
* Add CPUVendor/CPUModelNumber/CPUFamilyNumber attributes when running
|
|
on x86 architecture. Thanks to Ralph Castain for the suggestion.
|
|
* Work around buggy BIOS reporting duplicate NUMA nodes on Linux.
|
|
Thanks to Jeff Becker for reporting the problem and testing the patch.
|
|
* Add a name to the lstopo graphical window. Thanks to Michael Prokop
|
|
for reporting the issue.
|
|
|
|
|
|
Version 1.8.0
|
|
-------------
|
|
* New components
|
|
+ Add the "linuxpci" component that always works on Linux even when
|
|
libpciaccess and libpci aren't available (and even with a modified
|
|
file-system root). By default the old "pci" component runs first
|
|
because "linuxpci" lacks device names (obj->name is always NULL).
|
|
* API
|
|
+ Add the topology difference API in hwloc/diff.h for manipulating
|
|
many similar topologies.
|
|
+ Add hwloc_topology_dup() for duplicating an entire topology.
|
|
+ hwloc.h and hwloc/helper.h have been reorganized to clarify the
|
|
documentation sections. The actual inline code has moved out of hwloc.h
|
|
into the new hwloc/inlines.h.
|
|
+ Deprecated functions are now in hwloc/deprecated.h, and not in the
|
|
official documentation anymore.
|
|
* Tools
|
|
+ Add hwloc-diff and hwloc-patch tools together with the new diff API.
|
|
+ Add hwloc-compress-dir to (de)compress an entire directory of XML files
|
|
using hwloc-diff and hwloc-patch.
|
|
+ Object colors in the graphical output of lstopo may be changed by adding
|
|
a "lstopoStyle" info attribute. See CUSTOM COLORS in the lstopo(1) manpage
|
|
for details. Thanks to Jirka Hladky for discussing the idea.
|
|
+ hwloc-gather-topology may now gather I/O-related files on Linux when
|
|
--io is given. Only the linuxpci component supports discovering I/O
|
|
objects from these extended tarballs.
|
|
+ hwloc-annotate now supports --ri to remove/replace info attributes with
|
|
a given name.
|
|
+ hwloc-info supports "root" and "all" special locations for dumping
|
|
information about the root object.
|
|
+ lstopo now supports --append-legend to append custom lines of text
|
|
to the legend in the graphical output. Thanks to Jirka Hladky for
|
|
discussing the idea.
|
|
+ hwloc-calc and friends have a more robust parsing of locations given
|
|
on the command-line and they report useful error messages about it.
|
|
+ Add --whole-system to hwloc-bind, hwloc-calc, hwloc-distances and
|
|
hwloc-distrib, and add --restrict to hwloc-bind for uniformity among
|
|
tools.
|
|
* Misc
|
|
+ Calling hwloc_topology_load() or hwloc_topology_set_*() on an already
|
|
loaded topology now returns an error (deprecated since release 1.6.1).
|
|
+ Fix the initialisation of cpusets and nodesets in Group objects added
|
|
when inserting PCI hostbridges.
|
|
+ Never merge Group objects that were added explicitly by the user with
|
|
hwloc_custom_insert_group_object_by_parent().
|
|
+ Add a sanity check during dynamic plugin loading to prevent some
|
|
crashes when hwloc is dynamically loaded by another plugin mechanism.
|
|
+ Add --with-hwloc-plugins-path to specify the install/load directories
|
|
of plugins.
|
|
+ Add the MICSerialNumber info attribute to the root object when running
|
|
hwloc inside a Xeon Phi to match the same attribute in the MIC OS device
|
|
when running in the host.
|
|
|
|
|
|
Version 1.7.2
|
|
-------------
|
|
* Do not create invalid block OS devices on very old Linux kernels such
|
|
as RHEL4 2.6.9.
|
|
* Fix PCI subvendor/device IDs.
|
|
* Fix the management of Misc objects inserted by parent.
|
|
Thanks to Jirka Hladky for reporting the problem.
|
|
* Add a Port<n>State info attribute to OpenFabrics OS devices.
|
|
* Add a MICSerialNumber info attribute to Xeon Phi/MIC OS devices.
|
|
* Improve verbose error messages when failing to load from XML.
|
|
|
|
|
|
Version 1.7.1
|
|
-------------
|
|
* Fix a failed assertion in the distance grouping code when loading an XML
|
|
file that already contains some groups.
|
|
Thanks to Laercio Lima Pilla for reporting the problem.
|
|
* Remove unexpected Group objects when loading XML topologies with I/O
|
|
objects and NUMA distances.
|
|
Thanks to Elena Elkina for reporting the problem and testing patches.
|
|
* Fix PCI link speed discovery when using libpciaccess.
|
|
* Fix invalid libpciaccess virtual function device/vendor IDs when using
|
|
SR-IOV PCI devices on Linux.
|
|
* Fix GL component build with old NVCtrl releases.
|
|
Thanks to Jirka Hladky for reporting the problem.
|
|
* Fix embedding breakage caused by libltdl.
|
|
Thanks to Pavan Balaji for reporting the problem.
|
|
* Always use the system-wide libltdl instead of shipping one inside hwloc.
|
|
* Document issues when enabling plugins while embedding hwloc in another
|
|
project, in the documentation section Embedding hwloc in Other Software.
|
|
* Add a FAQ entry "How to get useful topology information on NetBSD?"
|
|
in the documentation.
|
|
* Some fixes in the renaming code for embedding.
|
|
* Miscellaneous minor build fixes.
|
|
|
|
|
|
Version 1.7.0
|
|
-------------
|
|
* New operating system backends
|
|
+ Add BlueGene/Q compute node kernel (CNK) support. See the FAQ in the
|
|
documentation for details. Thanks to Jeff Hammond, Christopher Samuel
|
|
and Erik Schnetter for their help.
|
|
+ Add NetBSD support, thanks to Aleksej Saushev.
|
|
* New I/O device discovery
|
|
+ Add co-processor OS devices such as "mic0" for Intel Xeon Phi (MIC)
|
|
on Linux. Thanks to Jerome Vienne for helping.
|
|
+ Add co-processor OS devices such as "cuda0" for NVIDIA CUDA-capable GPUs.
|
|
+ Add co-processor OS devices such as "opencl0d0" for OpenCL GPU devices
|
|
on the AMD OpenCL implementation.
|
|
+ Add GPU OS devices such as ":0.0" for NVIDIA X11 displays.
|
|
+ Add GPU OS devices such as "nvml0" for NVIDIA GPUs.
|
|
Thanks to Marwan Abdellah and Stefan Eilemann for helping.
|
|
These new OS devices have some string info attributes such as CoProcType,
|
|
GPUModel, etc. to better identify them.
|
|
See the I/O Devices and Attributes documentation sections for details.
|
|
* New components
|
|
+ Add the "opencl", "cuda", "nvml" and "gl" components for I/O device
|
|
discovery.
|
|
+ "nvml" also improves the discovery of NVIDIA GPU PCIe link speed.
|
|
All of these new components may be built as plugins. They may also be
|
|
disabled entirely by passing --disable-opencl/cuda/nvml/gl to configure.
|
|
See the I/O Devices, Components and Plugins, and FAQ documentation
|
|
sections for details.
|
|
* API
|
|
+ Add hwloc_topology_get_flags().
|
|
+ Add hwloc/plugins.h for building external plugins.
|
|
See the Adding new discovery components and plugins section.
|
|
* Interoperability
|
|
+ Add hwloc/opencl.h, hwloc/nvml.h, hwloc/gl.h and hwloc/intel-mic.h
|
|
to retrieve the locality of OS devices that correspond to AMD OpenCL
|
|
GPU devices or indexes, to NVML devices or indexes, to NVIDIA X11
|
|
displays, or to Intel Xeon Phi (MIC) device indexes.
|
|
+ Add new helpers in hwloc/cuda.h and hwloc/cudart.h to convert
|
|
between CUDA devices or indexes and hwloc OS devices.
|
|
+ Add hwloc_ibv_get_device_osdev() and clarify the requirements
|
|
of the OpenFabrics Verbs helpers in hwloc/openfabrics-verbs.h.
|
|
* Tools
|
|
+ hwloc-info is not only a synonym of lstopo -s anymore; it also
|
|
dumps information about objects given on the command-line.
|
|
* Documentation
|
|
+ Add a section "Existing components and plugins".
|
|
+ Add a list of common OS devices in section "Software devices".
|
|
+ Add a new FAQ entry "Why is lstopo slow?" about lstopo slowness
|
|
issues because of GPUs.
|
|
+ Clarify the documentation of inline helpers in hwloc/myriexpress.h
|
|
and hwloc/openfabrics-verbs.h.
|
|
* Misc
|
|
+ Improve cache detection on AIX.
|
|
+ The HWLOC_COMPONENTS variable now excludes the components whose
|
|
names are prefixed with '-'.
|
|
+ lstopo --ignore PU now works when displaying the topology in
|
|
graphical and textual mode (not when exporting to XML).
|
|
+ Make sure I/O options always appear in lstopo usage, not only when
|
|
using pciutils/libpci.
|
|
+ Remove some unneeded Linux specific includes from some interoperability
|
|
headers.
|
|
+ Fix some inconsistencies in hwloc-distrib and hwloc-assembler-remote
|
|
manpages. Thanks to Guy Streeter for the report.
|
|
+ Fix a memory leak on AIX when getting memory binding.
|
|
+ Fix many small memory leaks on Linux.
|
|
+ The `libpci' component is now called `pci' but the old name is still
|
|
accepted in the HWLOC_COMPONENTS variable for backward compatibility.
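
The '-' prefix rule for HWLOC_COMPONENTS mentioned above can be sketched
as follows. This is an illustration only, not hwloc's parser: the helper
name component_is_excluded() is hypothetical, and the sketch assumes the
variable holds a comma-separated list of component names.

```c
#include <string.h>

/* Hypothetical helper (not hwloc's parser): return 1 if `name` appears
 * with a '-' prefix in the comma-separated list `env`, 0 otherwise.
 * The comma separator is an assumption of this sketch. */
static int component_is_excluded(const char *env, const char *name)
{
  size_t namelen = strlen(name);
  while (env && *env) {
    /* Length of the current entry, up to the next comma or the end. */
    size_t len = strcspn(env, ",");
    if (len == namelen + 1 && env[0] == '-'
        && !strncmp(env + 1, name, namelen))
      return 1;
    env += len;
    if (*env == ',')
      env++;
  }
  return 0;
}
```

With a value such as "-cuda,linux", the sketch reports "cuda" as
excluded and "linux" as not excluded.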
|
|
|
|
|
|
Version 1.6.2
|
|
-------------
|
|
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
|
|
pciutils/libpci is only used if --enable-libpci is given to configure
|
|
because its GPL license may taint hwloc. See the Installation section
|
|
in the documentation for details.
|
|
* Fix get_cpubind on Solaris when bound to a single PU with
|
|
processor_bind(). Thanks to Eugene Loh for reporting the problem
|
|
and providing a patch.
|
|
|
|
|
|
Version 1.6.1
|
|
-------------
|
|
* Fix some crashes or buggy detection in the x86 backend when Linux
|
|
cgroups/cpusets restrict the available CPUs.
|
|
* Fix the pkg-config output with --libs --static.
|
|
Thanks to Erik Schnetter for reporting one of the problems.
|
|
* Fix the output of hwloc-calc -H --hierarchical when using logical
|
|
indexes in the output.
|
|
* Calling hwloc_topology_load() multiple times on the same topology
|
|
is officially deprecated. hwloc will warn in such cases.
|
|
* Add some documentation about existing plugins/components, package
|
|
dependencies, and I/O devices specification on the command-line.
|
|
|
|
|
|
Version 1.6.0
|
|
-------------
|
|
* Major changes
|
|
+ Reorganize the backend infrastructure to support dynamic selection
|
|
of components and dynamic loading of plugins. For details, see the
|
|
new documentation section Components and plugins.
|
|
- The HWLOC_COMPONENTS variable lets one replace the default discovery
|
|
components.
|
|
- Dynamic loading of plugins may be enabled with --enable-plugins
|
|
(except on AIX and Windows). It will build libxml2 and libpci
|
|
support as separate modules. This helps reduce the dependencies
|
|
of the core hwloc library when distributed as a binary package.
|
|
* Backends
|
|
+ Add CPUModel detection on Darwin and x86/FreeBSD.
|
|
Thanks to Robin Scher for providing ways to implement this.
|
|
+ The x86 backend now adds CPUModel info attributes to socket objects
|
|
created by other backends that do not natively support this attribute.
|
|
+ Fix detection on FreeBSD in case of cpuset restriction. Thanks to
|
|
Sebastian Kuzminsky for reporting the problem.
|
|
* XML
|
|
+ Add hwloc_topology_set_userdata_import/export_callback(),
|
|
hwloc_export_obj_userdata() and _userdata_base64() to let
|
|
applications specify how to save/restore the custom data they placed
|
|
in the userdata private pointer field of hwloc objects.
|
|
* Tools
|
|
+ Add hwloc-annotate program to add string info attributes to XML
|
|
topologies.
|
|
+ Add --pid-cmd to hwloc-ps to append the output of a command to each
|
|
PID line. May be used for showing Open MPI process ranks, see the
|
|
hwloc-ps(1) manpage for details.
|
|
+ hwloc-bind now exits with an error if binding fails; the executable
|
|
is not launched unless binding succeeded or --force was given.
|
|
+ Add --quiet to hwloc-calc and hwloc-bind to hide non-fatal error
|
|
messages.
|
|
+ Fix command-line pid support in windows tools.
|
|
+ All programs accept --verbose as a synonym to -v.
|
|
* Misc
|
|
+ Fix some DIR descriptor leaks on Linux.
|
|
+ Fix I/O device lists when some were filtered out after an XML import.
|
|
+ Fix the removal of I/O objects when importing an I/O-enabled XML topology
|
|
without any I/O topology flag.
|
|
+ When merging objects with HWLOC_IGNORE_TYPE_KEEP_STRUCTURE or
|
|
lstopo --merge, compare object types before deciding which one of two
|
|
identical objects to remove (e.g. keep sockets in favor of caches).
|
|
+ Add some GUID- and LID-related info attributes to OpenFabrics
|
|
OS devices.
|
|
+ Only add CPUType socket attributes on Solaris/Sparc. Other cases
|
|
don't report reliable information (Solaris/x86), and a replacement
|
|
is available as the Architecture string info in the Machine object.
|
|
+ Add missing Backend string info on Solaris in most cases.
|
|
+ Document object attributes and string infos in a new Attributes
|
|
section in the documentation.
|
|
+ Add a section about Synthetic topologies in the documentation.
|
|
|
|
|
|
Version 1.5.2 (some of these changes are in 1.6.2 but not in 1.6)
|
|
-------------
|
|
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
|
|
pciutils/libpci is only used if --enable-libpci is given to configure
|
|
because its GPL license may taint hwloc. See the Installation section
|
|
in the documentation for details.
|
|
* Fix get_cpubind on Solaris when bound to a single PU with
|
|
processor_bind(). Thanks to Eugene Loh for reporting the problem
|
|
and providing a patch.
|
|
* Fix some DIR descriptor leaks on Linux.
|
|
* Fix I/O device lists when some were filtered out after an XML import.
|
|
* Add missing Backend string info on Solaris in most cases.
|
|
* Fix the removal of I/O objects when importing an I/O-enabled XML topology
|
|
without any I/O topology flag.
|
|
* Fix the output of hwloc-calc -H --hierarchical when using logical
|
|
indexes in the output.
|
|
* Fix the pkg-config output with --libs --static.
|
|
Thanks to Erik Schnetter for reporting one of the problems.
|
|
|
|
|
|
Version 1.5.1
|
|
-------------
|
|
* Fix block OS device detection on Linux kernel 3.3 and later.
|
|
Thanks to Guy Streeter for reporting the problem and testing the fix.
|
|
* Fix the cpuid code in the x86 backend (for FreeBSD). Thanks to
|
|
Sebastian Kuzminsky for reporting problems and testing patches.
|
|
* Fix 64-bit detection on FreeBSD.
|
|
* Fix some corner cases in the management of the thissystem flag with
|
|
respect to topology flags and environment variables.
|
|
* Fix some corner cases in command-line parsing checks in hwloc-distrib
|
|
and hwloc-distances.
|
|
* Make sure we do not miss some block OS devices on old Linux kernels
|
|
when a single PCI device has multiple IDE hosts/devices behind it.
|
|
* Do not disable I/O devices or instruction caches in hwloc-assembler output.
|
|
|
|
|
|
Version 1.5.0
|
|
-------------
|
|
* Backends
|
|
+ Do not limit the number of processors to 1024 on Solaris anymore.
|
|
+ Gather total machine memory on FreeBSD. Thanks to Cyril Roelandt.
|
|
+ XML topology files do not depend on the locale anymore. Float numbers
|
|
such as NUMA distances or PCI link speeds now always use a dot as a
|
|
decimal separator.
|
|
+ Add instruction cache detection on Linux, AIX, Windows and Darwin.
|
|
+ Add get_last_cpu_location() support for the current thread on AIX.
|
|
+ Support binding on AIX when threads or processes were bound with
|
|
bindprocessor(). Thanks to Hendryk Bockelmann for reporting the issue
|
|
and testing patches, and to Farid Parpia for explaining the binding
|
|
interfaces.
|
|
+ Improve AMD topology detection in the x86 backend (for FreeBSD) using
|
|
the topoext feature.
|
|
* API
|
|
+ Increase HWLOC_API_VERSION to 0x00010500 so that API changes may be
|
|
detected at build-time.
|
|
+ Add a cache type attribute describing Data, Instruction and Unified
|
|
caches. Caches with different types but same depth (for instance L1d
|
|
and L1i) are placed on different levels.
|
|
+ Add hwloc_get_cache_type_depth() to retrieve the hwloc level depth of
|
|
the given cache depth and type, for instance L1i or L2.
|
|
It helps disambiguate the case where hwloc_get_type_depth() returns
|
|
HWLOC_TYPE_DEPTH_MULTIPLE.
|
|
+ Instruction caches are ignored unless HWLOC_TOPOLOGY_FLAG_ICACHES is
|
|
passed to hwloc_topology_set_flags() before load.
|
|
+ Add hwloc_ibv_get_device_osdev_by_name() OpenFabrics helper in
|
|
openfabrics-verbs.h to find the hwloc OS device object corresponding to
|
|
an OpenFabrics device.
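
The build-time detection enabled by HWLOC_API_VERSION can be sketched as
below. The local #define is only a stand-in so that the fragment is
self-contained (real code would include hwloc.h instead), and the names
api_major(), api_minor() and HAVE_CACHE_TYPE are hypothetical. The
decoding assumes the (X<<16)+(Y<<8)+Z layout implied by 0x00010500 for
v1.5.0.

```c
/* Stand-in for the value that hwloc.h defines in v1.5; real code would
 * include <hwloc.h> instead of defining it here. */
#ifndef HWLOC_API_VERSION
#define HWLOC_API_VERSION 0x00010500
#endif

/* Decode the assumed (major << 16) + (minor << 8) + revision layout. */
static unsigned api_major(unsigned v) { return (v >> 16) & 0xff; }
static unsigned api_minor(unsigned v) { return (v >> 8) & 0xff; }

/* Hypothetical feature macro: select code paths at build time depending
 * on whether the cache type attribute added in v1.5 is available. */
#if HWLOC_API_VERSION >= 0x00010500
#define HAVE_CACHE_TYPE 1
#else
#define HAVE_CACHE_TYPE 0
#endif
```

An application that must build against both older and newer releases
could guard its use of the cache type attribute with such a macro.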
|
|
* Tools
|
|
+ Add lstopo-no-graphics, a lstopo built without graphical support to
|
|
avoid dependencies on external libraries such as Cairo and X11. When
|
|
supported, graphical outputs are only available in the original lstopo
|
|
program.
|
|
- Packagers splitting lstopo and lstopo-no-graphics into different
|
|
packages are advised to use the alternatives system so that lstopo
|
|
points to the best available binary.
|
|
+ Instruction caches are enabled in lstopo by default. Use --no-icaches
|
|
to disable them.
|
|
+ Add -t/--threads to show threads in hwloc-ps.
|
|
* Removal of obsolete components
|
|
+ Remove the old cpuset interface (hwloc/cpuset.h) which is deprecated and
|
|
superseded by the bitmap API (hwloc/bitmap.h) since v1.1.
|
|
hwloc_cpuset and nodeset types are still defined, but all hwloc_cpuset_*
|
|
compatibility wrappers are now gone.
|
|
+ Remove Linux libnuma conversion helpers for the deprecated and
|
|
broken nodemask_t interface.
|
|
+ Remove support for the "Proc" type name, which was superseded by "PU" in v1.0.
|
|
+ Remove hwloc-mask symlinks, which were replaced by hwloc-calc in v1.0.
|
|
* Misc
|
|
+ Fix PCIe 3.0 link speed computation.
|
|
+ Non-printable characters are dropped from strings during XML export.
|
|
+ Fix importing of escaped characters with the minimalistic XML backend.
|
|
+ Assert hwloc_is_thissystem() in several I/O related helpers.
|
|
+ Fix some memory leaks in the x86 backend for FreeBSD.
|
|
+ Minor fixes to ease native builds on Windows.
|
|
+ Limit the number of retries when operating on all threads within a
|
|
process on Linux if the list of threads is being heavily modified.
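
The PCIe 3.0 entry above does not say what the bug was. As background
only, the sketch below shows the usual way a usable link bandwidth is
derived from the generation and the link width: generations 1 and 2
signal at 2.5 and 5 GT/s with 8b/10b encoding, while generation 3
signals at 8 GT/s with 128b/130b encoding. The function name
pcie_link_speed() is hypothetical and this is not presented as hwloc's
code.

```c
/* Hypothetical helper: usable PCIe link bandwidth in GB/s.
 * gen:   PCIe generation (1, 2 or 3)
 * lanes: link width (1 for x1, 16 for x16, ...) */
static double pcie_link_speed(unsigned gen, unsigned lanes)
{
  double lane_gbits;
  if (gen <= 2)
    lane_gbits = 2.5 * gen * 8 / 10;   /* 2.5 or 5 GT/s, 8b/10b encoding */
  else
    lane_gbits = 8.0 * 128 / 130;      /* 8 GT/s, 128b/130b encoding */
  return lane_gbits * lanes / 8;       /* bits to bytes */
}
```

Applying the 8b/10b factor to a generation 3 link would underestimate
its bandwidth, which is why the encoding has to be chosen per generation.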
|
|
|
|
|
|
Version 1.4.3
|
|
-------------
|
|
* This release is only meant to fix the pciutils license issue when upgrading
|
|
to hwloc v1.5 or later is not possible. It contains several other minor
|
|
fixes but ignores many of them that are only in v1.5 or later.
|
|
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
|
|
pciutils/libpci is only used if --enable-libpci is given to configure
|
|
because its GPL license may taint hwloc. See the Installation section
|
|
in the documentation for details.
|
|
* Fix PCIe 3.0 link speed computation.
|
|
* Fix importing of escaped characters with the minimalistic XML backend.
|
|
* Fix a memory leak in the x86 backend.
|
|
|
|
|
|
Version 1.4.2
|
|
-------------
|
|
* Fix build on Solaris 9 and earlier when fabsf() is not a compiler
|
|
built-in. Thanks to Igor Galić for reporting the problem.
|
|
* Fix support for more than 32 processors on Windows. Thanks to Hartmut
|
|
Kaiser for reporting the problem.
|
|
* Fix process-wide binding and cpulocation routines on Linux when some
|
|
threads disappear in the meantime. Thanks to Vlad Roubtsov for reporting
|
|
the issue.
|
|
* Make installed scripts executable. Thanks to Jirka Hladky for reporting
|
|
the problem.
|
|
* Fix libtool revision management when building for Windows. This fix was
|
|
also released as hwloc v1.4.1.1 Windows builds. Thanks to Hartmut Kaiser
|
|
for reporting the problem.
|
|
* Fix the __hwloc_inline keyword in public headers when compiling with a
|
|
C++ compiler.
|
|
* Add Port info attribute to network OS devices inside OpenFabrics PCI
|
|
devices so as to identify which interface corresponds to which port.
|
|
* Document requirements for interoperability helpers: I/O devices discovery
|
|
is required for some of them; the topology must match the current host
|
|
for most of them.
|
|
|
|
|
|
Version 1.4.1 (contains all 1.3.2 changes)
|
|
-------------
|
|
* Fix hwloc_alloc_membind, thanks Karl Napf for reporting the issue.
|
|
* Fix memory leaks in some get_membind() functions.
|
|
* Fix helpers converting from Linux libnuma to hwloc (hwloc/linux-libnuma.h)
|
|
in case of out-of-order NUMA node ids.
|
|
* Fix some overzealous assertions in the distance grouping code.
|
|
* Work around BIOS reporting empty I/O locality in CUDA and OpenFabrics
|
|
helpers on Linux. Thanks to Albert Solernou for reporting the problem.
|
|
* Install a valgrind suppressions file hwloc-valgrind.supp (see the FAQ).
|
|
* Fix memory binding documentation. Thanks to Karl Napf for reporting the
|
|
issues.
|
|
|
|
|
|
Version 1.4.0 (does not contain all 1.3.2 changes)
|
|
-------------
|
|
* Major features
|
|
+ Add "custom" interface and "assembler" tools to build multi-node
|
|
topologies. See the Multi-node Topologies section in the documentation
|
|
for details.
|
|
* Interface improvements
|
|
+ Add symmetric_subtree object attribute to ease assumptions when consulting
|
|
regular symmetric topologies.
|
|
+ Add CPUModel and CPUType info attributes to Socket objects on Linux
|
|
and Solaris.
|
|
+ Add hwloc_get_obj_index_inside_cpuset() to retrieve the "logical" index
|
|
of an object within a subtree of the topology.
|
|
+ Add more NVIDIA CUDA helpers in cuda.h and cudart.h to find hwloc objects
|
|
corresponding to CUDA devices.
|
|
* Discovery improvements
|
|
+ Add a group object above partial distance matrices to make sure
|
|
the matrices are available in the final topology, except when this
|
|
new object would contradict the existing hierarchy.
|
|
+ Grouping by distances now also works when loading from XML.
|
|
+ Fix some corner cases in object insertion, for instance when dealing
|
|
with NUMA nodes without any CPU.
|
|
* Backends
|
|
+ Implement hwloc_get_area_membind() on Linux.
|
|
+ Honor I/O topology flags when importing from XML.
|
|
+ Further improve XML-related error checking and reporting.
|
|
+ Hide synthetic topology error messages unless HWLOC_SYNTHETIC_VERBOSE=1.
|
|
* Tools
|
|
+ Add synthetic exporting of symmetric topologies to lstopo.
|
|
+ lstopo --horiz and --vert can now be applied to some specific object types.
|
|
+ lstopo -v -p now displays distance matrices with physical indexes.
|
|
+ Add hwloc-distances utility to list distances.
|
|
* Documentation
|
|
+ Fix and/or document the behavior of most inline functions in hwloc/helper.h
|
|
when the topology contains some I/O or Misc objects.
|
|
+ Backend documentation enhancements.
|
|
* Bug fixes
|
|
+ Fix missing last bit in hwloc_linux_get_thread_cpubind().
|
|
Thanks to Carolina Gómez-Tostón Gutiérrez for reporting the issue.
|
|
+ Fix FreeBSD build without cpuid support.
|
|
+ Fix several Windows build issues.
|
|
+ Fix inline keyword definition in public headers.
|
|
+ Fix dependencies in the embedded library.
|
|
+ Improve visibility support detection. Thanks to Dave Love for providing
|
|
the patch.
|
|
+ Remove references to internal symbols in the tools.
|
|
|
|
|
|
Version 1.3.3
|
|
-------------
|
|
* This release is only meant to fix the pciutils license issue when upgrading
|
|
to hwloc v1.4 or later is not possible. It contains several other minor
|
|
fixes but ignores many of them that are only in v1.4 or later.
|
|
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
|
|
pciutils/libpci is only used if --enable-libpci is given to configure
|
|
because its GPL license may taint hwloc. See the Installation section
|
|
in the documentation for details.
|
|
|
|
|
|
Version 1.3.2
|
|
-------------
|
|
* Fix missing last bit in hwloc_linux_get_thread_cpubind().
|
|
Thanks to Carolina Gómez-Tostón Gutiérrez for reporting the issue.
|
|
* Fix build with -mcmodel=medium. Thanks to Devendar Bureddy for reporting
|
|
the issue.
|
|
* Fix build with Solaris Studio 12 compiler when XML is disabled.
|
|
Thanks to Paul H. Hargrove for reporting the problem.
|
|
* Fix installation with old GNU sed, for instance on Red Hat 8.
|
|
Thanks to Paul H. Hargrove for reporting the problem.
|
|
* Fix PCI locality when Linux cgroups restrict the available CPUs.
|
|
* Fix floating point issue when grouping by distance on mips64 architecture.
|
|
Thanks to Paul H. Hargrove for reporting the problem.
|
|
* Fix conversion from/to Linux libnuma when some NUMA nodes have no memory.
|
|
* Fix support for gccfss compilers with broken ffs() support. Thanks to
|
|
Paul H. Hargrove for reporting the problem and providing a patch.
|
|
* Fix FreeBSD build without cpuid support.
|
|
* Fix several Windows build issues.
|
|
* Fix inline keyword definition in public headers.
|
|
* Fix dependencies in the embedded library.
|
|
* Detect when a compiler such as xlc may not report compile errors
|
|
properly, causing some configure checks to be wrong. Thanks to
|
|
Paul H. Hargrove for reporting the problem and providing a patch.
|
|
* Improve visibility support detection. Thanks to Dave Love for providing
|
|
the patch.
|
|
* Remove references to internal symbols in the tools.
|
|
* Fix installation on systems with limited command-line size.
|
|
Thanks to Paul H. Hargrove for reporting the problem.
|
|
* Further improve XML-related error checking and reporting.
|
|
|
|
|
|
Version 1.3.1
|
|
-------------
|
|
* Fix pciutils detection with pkg-config when not installed in standard
|
|
directories.
|
|
* Fix visibility options detection with the Solaris Studio compiler.
|
|
Thanks to Igor Galić and Terry Dontje for reporting the problems.
|
|
* Fix support for old Linux sched.h headers such as those found
|
|
on Red Hat 8. Thanks to Paul H. Hargrove for reporting the problems.
|
|
* Fix inline and attribute support for Solaris compilers. Thanks to
|
|
Dave Love for reporting the problems.
|
|
* Print a short summary at the end of the configure output. Thanks to
|
|
Stefan Eilemann for the suggestion.
|
|
* Add --disable-libnuma configure option to disable libnuma-based
|
|
memory binding support on Linux. Thanks to Rayson Ho for the
|
|
suggestion.
|
|
* Make hwloc's configure script properly obey $PKG_CONFIG. Thanks to
|
|
Nathan Phillip Brink for raising the issue.
|
|
* Silence some harmless pciutils warnings, thanks to Paul H. Hargrove
|
|
for reporting the problem.
|
|
* Fix the documentation with respect to hwloc_pid_t and hwloc_thread_t
|
|
being either pid_t and pthread_t on Unix, or HANDLE on Windows.
|
|
|
|
|
|
Version 1.3.0
|
|
-------------
|
|
* Major features
|
|
+ Add I/O devices and bridges to the topology using the pciutils
|
|
library. Only enabled after setting the relevant flag with
|
|
hwloc_topology_set_flags() before hwloc_topology_load(). See the
|
|
I/O Devices section in the documentation for details.
|
|
* Discovery improvements
|
|
+ Add associativity to the cache attributes.
|
|
+ Add support for s390/z11 "books" on Linux.
|
|
+ Add the HWLOC_GROUPING_ACCURACY environment variable to relax
|
|
distance-based grouping constraints. See the Environment Variables
|
|
section in the documentation for details about grouping behavior
|
|
and configuration.
|
|
+ Allow user-given distance matrices to remove or replace those
|
|
discovered by the OS backend.
|
|
* XML improvements
|
|
+ XML is now always supported: a minimalistic custom import/export
|
|
code is used when libxml2 is not available. It is only guaranteed
|
|
to read XML files generated by hwloc.
|
|
+ hwloc_topology_export_xml() and export_xmlbuffer() now return an
|
|
integer.
|
|
+ Add hwloc_free_xmlbuffer() to free the buffer allocated by
|
|
hwloc_topology_export_xmlbuffer().
|
|
+ Hide XML topology error messages unless HWLOC_XML_VERBOSE=1.
|
|
* Minor API updates
|
|
+ Add hwloc_obj_add_info to customize object info attributes.
|
|
* Tools
|
|
+ lstopo now displays I/O devices by default. Several options are
|
|
added to configure the I/O discovery.
|
|
+ hwloc-calc and hwloc-bind now accept I/O devices as input.
|
|
+ Add --restrict option to hwloc-calc and hwloc-distribute.
|
|
+ Add --sep option to change the output field separator in hwloc-calc.
|
|
+ Add --whole-system option to hwloc-ps.
|
|
|
|
|
|
Version 1.2.2
|
|
-------------
|
|
* Fix build on AIX 5.2, thanks Utpal Kumar Ray for the report.
|
|
* Fix XML import of very large page sizes or counts on 32-bit platforms,
|
|
thanks to Karsten Hopp for the Red Hat ticket.
|
|
* Fix crash when administrator limitations such as Linux cgroup require
|
|
restricting distance matrices. Thanks to Ake Sandgren for reporting the
|
|
problem.
|
|
* Fix the removal of objects such as AMD Magny-Cours dual-node sockets
|
|
in case of administrator restrictions.
|
|
* Improve error reporting and messages in case of wrong synthetic topology
|
|
description.
|
|
* Several other minor internal fixes and documentation improvements.
|
|
|
|
|
|
Version 1.2.1
|
|
-------------
|
|
* Improve support of AMD Bulldozer "Compute-Unit" modules by detecting
|
|
logical processors with different core IDs on Linux.
|
|
* Fix hwloc-ps crash when listing processes from another Linux cpuset.
|
|
Thanks to Carl Smith for reporting the problem.
|
|
* Fix build on AIX and Solaris. Thanks to Carl Smith and Andreas Kupries
|
|
for reporting the problems.
|
|
* Fix cache size detection on Darwin. Thanks to Erkcan Özcan for reporting
|
|
the problem.
|
|
* Make configure fail if --enable-xml or --enable-cairo is given and
|
|
proper support cannot be found. Thanks to Andreas Kupries for reporting
|
|
the XML problem.
|
|
* Fix spurious L1 cache detection on AIX. Thanks to Hendryk Bockelmann
|
|
for reporting the problem.
|
|
* Fix hwloc_get_last_cpu_location(THREAD) on Linux. Thanks to Gabriele
|
|
Fatigati for reporting the problem.
|
|
* Fix object distance detection on Solaris.
|
|
* Add pthread_self weak symbol to ease static linking.
|
|
* Minor documentation fixes.
|
|
|
|
|
|
Version 1.2.0
|
|
-------------
|
|
* Major features
|
|
+ Expose latency matrices in the API as an array of distance structures
|
|
within objects. Add several helpers to find distances.
|
|
+ Add hwloc_topology_set_distance_matrix() and environment variables
|
|
to provide a matrix of distances between a given set of objects.
|
|
+ Add hwloc_get_last_cpu_location() and hwloc_get_proc_last_cpu_location()
|
|
to retrieve the processors where a process or thread recently ran.
|
|
- Add the corresponding --get-last-cpu-location option to hwloc-bind.
|
|
+ Add hwloc_topology_restrict() to restrict an existing topology to a
|
|
given cpuset.
|
|
- Add the corresponding --restrict option to lstopo.
|
|
* Minor API updates
|
|
+ Add hwloc_bitmap_list_sscanf/snprintf/asprintf to convert between bitmaps
|
|
and strings such as 4-5,7-9,12,15-
|
|
+ hwloc_bitmap_set/clr_range() now support infinite ranges.
|
|
+ Clarify the difference between inserting Misc objects by cpuset or by
|
|
parent.
|
|
+ hwloc_insert_misc_object_by_cpuset() now returns NULL in case of error.
|
|
* Discovery improvements
|
|
+ x86 backend (for freebsd): add x2APIC support
|
|
+ Support standard device-tree phandle, to get better support on e.g. ARM
|
|
systems providing it.
|
|
+ Detect cache size on AIX. Thanks Christopher and IBM.
|
|
+ Improve grouping to support asymmetric topologies.
|
|
* Tools
|
|
+ Command-line tools now support "all" and "root" special locations
|
|
consisting in the entire topology, as well as type names with depth
|
|
attributes such as L2 or Group4.
|
|
+ hwloc-calc improvements:
|
|
- Add --number-of/-N option to report the number of objects of a given
|
|
type or depth.
|
|
- -I is now equivalent to --intersect for listing the indexes of
|
|
objects of a given type or depth that intersects the input.
|
|
- Add -H to report the output as a hierarchical combination of types
|
|
and depths.
|
|
+ Add --thissystem to lstopo.
|
|
+ Add lstopo-win, a console-less lstopo variant on Windows.
|
|
* Miscellaneous
|
|
+ Remove C99 usage from code base.
|
|
+ Rename hwloc-gather-topology.sh into hwloc-gather-topology
|
|
+ Fix AMD cache discovery on freebsd when there is no L3 cache, thanks
|
|
Andriy Gapon for the fix.
|
|
|
|
|
|
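As an illustration of the list string format mentioned for
hwloc_bitmap_list_sscanf in version 1.2.0 (strings such as 4-5,7-9,12,15-),
here is a minimal standalone sketch. It does not use hwloc and is not how
the library implements it: list_to_mask() is a hypothetical helper, and it
assumes a fixed 64-bit mask, so an open-ended range such as 15- is clipped
at bit 63 whereas a real hwloc bitmap may be infinite.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of hwloc: parse a list string such as
 * "4-5,7-9,12,15-" into a 64-bit mask. An open-ended range ("15-") is
 * clipped at bit 63 (assumption of this sketch).
 * Returns 0 on success, -1 on malformed input or an index above 63. */
int list_to_mask(const char *str, uint64_t *mask)
{
  const char *p = str;
  char *end;
  uint64_t result = 0;

  assert(str != NULL && mask != NULL);

  while (*p != '\0') {
    unsigned long first, last, i;

    /* each item must start with a digit */
    if (*p < '0' || *p > '9')
      return -1;
    errno = 0;
    first = strtoul(p, &end, 10);
    if (errno != 0 || first > 63)
      return -1;
    last = first;
    p = end;

    if (*p == '-') {
      p++;
      if (*p >= '0' && *p <= '9') {
        /* closed range "first-last" */
        errno = 0;
        last = strtoul(p, &end, 10);
        if (errno != 0 || last > 63 || last < first)
          return -1;
        p = end;
      } else {
        /* open-ended range "first-" */
        last = 63;
      }
    }

    for (i = first; i <= last; i++)
      result |= (uint64_t)1 << i;

    if (*p == ',') {
      p++;
      if (*p == '\0')
        return -1; /* trailing comma */
    } else if (*p != '\0') {
      return -1;   /* unexpected character */
    }
  }

  *mask = result;
  return 0;
}
```

With this sketch, "4-5,7-9,12" yields the mask 0x13B0, and appending ",15-"
additionally sets bits 15 through 63.

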
Version 1.1.2
-------------
* Fix a segfault in the distance-based grouping code when some objects
  are not placed in any group. Thanks to Bernd Kallies for reporting
  the problem and providing a patch.
* Fix the command-line parsing of hwloc-bind --mempolicy interleave.
  Thanks to Guy Streeter for reporting the problem.
* Stop truncating the output in hwloc_obj_attr_snprintf() and in the
  corresponding lstopo output. Thanks to Guy Streeter for reporting the
  problem.
* Fix object levels ordering in synthetic topologies.
* Fix potential inconsistency between device tree and kernel information,
  when SMT is disabled on Power machines.
* Fix and document the behavior of hwloc_topology_set_synthetic() in case
  of invalid argument. Thanks to Guy Streeter for reporting the problem.
* Add some verbose error message reporting when it looks like the OS
  gives erroneous information.
* Do not include unistd.h and stdint.h in public headers on Windows.
* Move config.h files into their own subdirectories to avoid name
  conflicts when AC_CONFIG_HEADERS adds -I's for them.
* Stop declaring variables inside "for" loops.
* Some other minor fixes.
* Many minor documentation fixes.


Version 1.1.1
-------------
* Add hwloc_get_api_version() which returns the version of hwloc used
  at runtime. Thanks to Guy Streeter for the suggestion.
* Fix the number of hugepages reported for NUMA nodes on Linux.
* Fix hwloc_bitmap_to_ulong() right after allocating the bitmap.
  Thanks to Bernd Kallies for reporting the problem.
* Fix hwloc_bitmap_from_ith_ulong() to properly zero the first ulong.
  Thanks to Guy Streeter for reporting the problem.
* Fix hwloc_get_membind_nodeset() on Linux.
  Thanks to Bernd Kallies for reporting the problem and providing a patch.
* Fix some file descriptor leaks in the Linux discovery.
* Fix the minimum width of NUMA nodes, caches and the legend in the graphical
  lstopo output. Thanks to Jirka Hladky for reporting the problem.
* Various fixes to bitmap conversion from/to taskset-strings.
* Fix and document the behavior of snprintf functions when the buffer size
  is too small or zero. Thanks to Guy Streeter for reporting the problem.
* Fix configure to avoid spurious enabling of the cpuid backend.
  Thanks to Tim Anderson for reporting the problem.
* Clean up error management in hwloc-gather-topology.sh.
  Thanks to Jirka Hladky for reporting the problem and providing a patch.
* Add a manpage and usage for hwloc-gather-topology.sh on Linux.
  Thanks to Jirka Hladky for providing a patch.
* Memory binding documentation enhancements.


Version 1.1.0
-------------

* API
  + Increase HWLOC_API_VERSION to 0x00010100 so that API changes may be
    detected at build-time.
  + Add a memory binding interface.
  + The cpuset API (hwloc/cpuset.h) is now deprecated. It is replaced by
    the bitmap API (hwloc/bitmap.h) which offers the same features with more
    generic names since it applies to CPU sets, node sets and more.
    Backward compatibility with the cpuset API and ABI is still provided but
    it will be removed in a future release.
    Old types (hwloc_cpuset_t, ...) are still available as a way to clarify
    what kind of hwloc_bitmap_t each API function manipulates.
    Upgrading to the new API only requires replacing hwloc_cpuset_ function
    calls with the corresponding hwloc_bitmap_ calls, with the following
    renaming exceptions:
    - hwloc_cpuset_cpu -> hwloc_bitmap_only
    - hwloc_cpuset_all_but_cpu -> hwloc_bitmap_allbut
    - hwloc_cpuset_from_string -> hwloc_bitmap_sscanf
  + Add an `infos' array in each object to store pairs of info names and
    values. It enables generic storage of things like the old dmi board infos
    that were previously stored in machine specific attributes.
  + Add linesize cache attribute.
* Features
  + Bitmaps (and thus CPU sets and node sets) are dynamically (re-)allocated,
    the maximal number of CPUs (HWLOC_NBMAXCPUS) has been removed.
  + Improve the distance-based grouping code to better support irregular
    distance matrices.
  + Add support for device-tree to get cache information (useful on Power
    architectures).
* Helpers
  + Add NVIDIA CUDA helpers in cuda.h and cudart.h to ease interoperability
    with CUDA Runtime and Driver APIs.
  + Add Myrinet Express helper in myriexpress.h to ease interoperability.
* Tools
  + lstopo now displays physical/OS indexes by default in graphical mode
    (use -l to switch back to logical indexes). The textual output still uses
    logical by default (use -p to switch to physical indexes).
  + lstopo prefixes logical indexes with `L#' and physical indexes with `P#'.
    Physical indexes are also printed as `P#N' instead of `phys=N' within
    object attributes (in parentheses).
  + Add a legend at the bottom of the lstopo graphical output, use --no-legend
    to remove it.
  + Add hwloc-ps to list process bindings.
  + Add --membind and --mempolicy options to hwloc-bind.
  + Improve tools command-line options by adding a generic --input option
    (and more) which replaces the old --xml, --synthetic and --fsys-root.
  + Clean up lstopo output configuration by adding --output-format.
  + Add --intersect in hwloc-calc, and replace --objects with --largest.
  + Add the ability to work on standard input in hwloc-calc.
  + Add --from, --to and --at in hwloc-distrib.
  + Add taskset-specific functions and command-line tool options to
    manipulate CPU set strings in the format of the taskset program.
  + Install hwloc-gather-topology.sh on Linux.


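The build-time detection mentioned for HWLOC_API_VERSION in version 1.1.0
can be sketched as follows. This is a standalone illustration under stated
assumptions: the fallback #define only stands in for <hwloc.h> so that the
sketch compiles without hwloc installed, HAVE_HWLOC_BITMAP_API and
decode_api_version() are hypothetical names, and the decoding assumes the
(X<<16)+(Y<<8)+Z encoding of release X.Y.Z described in the hwloc
documentation.

```c
/* Stand-in for <hwloc.h> so this sketch compiles without hwloc installed
 * (assumption); real code gets HWLOC_API_VERSION from the hwloc headers. */
#ifndef HWLOC_API_VERSION
#define HWLOC_API_VERSION 0x00010100
#endif

/* Hypothetical feature macro: the bitmap API replaced the cpuset API
 * in 1.1.0, so select it when building against 1.1.0 or later. */
#if HWLOC_API_VERSION >= 0x00010100
#define HAVE_HWLOC_BITMAP_API 1
#else
#define HAVE_HWLOC_BITMAP_API 0
#endif

/* Hypothetical helper, not part of hwloc: split a version number into
 * its components, assuming the (X<<16)+(Y<<8)+Z encoding. */
struct api_version {
  unsigned major;
  unsigned minor;
  unsigned patch;
};

struct api_version decode_api_version(unsigned v)
{
  struct api_version out;
  out.major = (v >> 16) & 0xffffu;
  out.minor = (v >> 8) & 0xffu;
  out.patch = v & 0xffu;
  return out;
}
```

Under these assumptions, 0x00010100 decodes to 1.1.0. The same decoding
could be applied to the value returned by hwloc_get_api_version(), added
in 1.1.1, to compare the runtime library against the build-time headers.

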
Version 1.0.3
-------------

* Fix support for Linux cpuset when emulated by a cgroup mount point.
* Remove unneeded runtime dependency on libibverbs.so in the library and
  all utility programs.
* Fix hwloc_cpuset_to_linux_libnuma_ulongs in case of non-linear OS-indexes
  for NUMA nodes.
* lstopo now displays physical/OS indexes by default in graphical mode
  (use -l to switch back to logical indexes). The textual output still uses
  logical by default (use -p to switch to physical indexes).


Version 1.0.2
-------------

* Public headers can now be included directly from C++ programs.
* Solaris fix for non-contiguous CPU numbers. Thanks to Rolf vandeVaart for
  reporting the issue.
* Darwin 10.4 fix. Thanks to Olivier Cessenat for reporting the issue.
* Revert 1.0.1 patch that ignored sockets with unknown ID values since it
  only slightly helped POWER7 machines with old Linux kernels while it
  prevents recent kernels from getting the complete POWER7 topology.
* Fix hwloc_get_common_ancestor_obj().
* Remove arch-specific bits in public headers.
* Some fixes in the lstopo graphical output.
* Various man page clarifications and minor updates.


Version 1.0.1
-------------

* Various Solaris fixes. Thanks to Yannick Martin for reporting the issue.
* Fix "non-native" builds on x86 platforms (e.g., when building 32
  bit executables with compilers that natively build 64 bit).
* Ignore sockets with unknown ID values (which fixes issues on POWER7
  machines). Thanks to Greg Bauer for reporting the issue.
* Various man page clarifications and minor updates.
* Fixed memory leaks in hwloc_setup_group_from_min_distance_clique().
* Fix cache type filtering on MS Windows 7. Thanks to Αλέξανδρος
  Παπαδογιαννάκ for reporting the issue.
* Fixed warnings when compiling with -DNDEBUG.


Version 1.0.0
-------------

* The ABI of the library has changed.
* Backend updates
  + Add FreeBSD support.
  + Add x86 cpuid based backend.
  + Add Linux cgroup support to the Linux cpuset code.
  + Support binding of entire multithreaded process on Linux.
  + Fix and enable Group support in Windows.
  + Clean up XML export/import.
* Objects
  + HWLOC_OBJ_PROC is renamed into HWLOC_OBJ_PU for "Processing Unit",
    its stringified type name is now "PU".
  + Use new HWLOC_OBJ_GROUP objects instead of MISC when grouping
    objects according to NUMA distances or arbitrary OS aggregation.
  + Rework memory attributes.
  + Add different cpusets in each object to specify processors that
    are offline, unavailable, ...
  + Clean up the storage of object names and DMI infos.
* Features
  + Add support for looking up specific PID topology information.
  + Add hwloc_topology_export_xml() to export the topology to an XML file.
  + Add hwloc_topology_get_support() to retrieve the supported features
    for the current topology context.
  + Support non-SYSTEM objects as the root of the tree, use MACHINE in
    most common cases.
  + Add hwloc_get_*cpubind() routines to retrieve the current binding
    of processes and threads.
* API
  + Add HWLOC_API_VERSION to help detect the currently used API version.
  + Add missing ending "e" to *compare* functions.
  + Add several routines to emulate PLPA functions.
  + Rename and rework the cpuset and/or/xor/not/clear operators to output
    their result in a dedicated argument instead of modifying one input.
  + Deprecate hwloc_obj_snprintf() in favor of hwloc_obj_type/attr_snprintf().
  + Clarify the use of parent and ancestor in the API, do not use father.
  + Replace hwloc_get_system_obj() with hwloc_get_root_obj().
  + Return -1 instead of HWLOC_OBJ_TYPE_MAX in the API since the latter
    isn't public.
  + Relax constraints in hwloc_obj_type_of_string().
  + Improve displaying of memory sizes.
  + Add 0x prefix to cpuset strings.
* Tools
  + lstopo now displays logical indexes by default, use --physical to
    revert to OS/physical indexes.
  + Add colors in the lstopo graphical outputs to distinguish between online,
    offline, reserved, ... objects.
  + Extend lstopo to show cpusets, filter objects by type, ...
  + Rename hwloc-mask into hwloc-calc, which supports many new options.
* Documentation
  + Add a hwloc(7) manpage containing general information.
  + Add documentation about how to switch from PLPA to hwloc.
  + Clean up the distributed documentation files.
* Miscellaneous
  + Many compiler warning fixes.
  + Clean up the ABI by using the visibility attribute.
  + Add project embedding support.


Version 0.9.4 (unreleased)
-------------

* Fix resetting colors to normal in lstopo -.txt output.
* Fix Linux pthread_t binding error report.


Version 0.9.3
-------------

* Fix autogen.sh to work with Autoconf 2.63.
* Fix various crashes in particular conditions:
  - xml files with root attributes
  - offline CPUs
  - partial sysfs support
  - unparseable /proc/cpuinfo
  - ignoring the NUMA level while Misc levels have been generated
* Tweak documentation a bit.
* Do not require the pthread library for binding the current thread on Linux.
* Do not erroneously assume that the sched_setaffinity prototype is the old
  version when there is actually none.
* Fix _syscall3 compilation on archs for which we do not have the
  sched_setaffinity system call number.
* Fix AIX binding.
* Fix library dependencies: now only lstopo depends on libtermcap, fix
  binutils-gold link.
* Have make check always build and run hwloc-hello.c.
* Do not limit size of a cpuset.


Version 0.9.2
-------------

* Trivial documentation changes.


Version 0.9.1
-------------

* Re-branded to "hwloc" and moved to the Open MPI project, relicensed under the
  BSD license.
* The prefix of all functions and tools is now hwloc, and some public
  functions were also renamed for real.
* Group NUMA nodes into Misc objects according to their physical distance
  that may be reported by the OS/BIOS.
  May be ignored by setting HWLOC_IGNORE_DISTANCES=1 in the environment.
* Ignore offline CPUs on Solaris.
* Improved binding support on AIX.
* Add HP-UX support.
* CPU sets are now allocated/freed dynamically.
* Add command line options to tune the lstopo graphical output, add
  semi-graphical textual output.
* Extend topobind to support multiple cpusets or objects on the command
  line as topomask does.
* Add an InfiniBand-specific helper hwloc/openfabrics-verbs.h to retrieve
  the physical location of IB devices.


Version 0.9 (formerly named "libtopology")
-----------

* First release.