All of the interesting technological, artistic or just plain fun subjects I'd investigate if I had an infinite number of lifetimes. In other words, a dumping ground...

Tuesday 6 November 2007

Regular expressions with C on Linux

Have to try this out and see how well it works...
man regex

REGCOMP(3)          Linux Programmer's Manual          REGCOMP(3)

NAME
regcomp, regexec, regerror, regfree - POSIX regex functions

SYNOPSIS
#include <sys/types.h>
#include <regex.h>

int regcomp(regex_t *preg, const char *regex, int cflags);
int regexec(const regex_t *preg, const char *string, size_t nmatch,
regmatch_t pmatch[], int eflags);
size_t regerror(int errcode, const regex_t *preg, char *errbuf,
size_t errbuf_size);
void regfree(regex_t *preg);
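
To get a feel for the nmatch and pmatch parameters in the regexec prototype, here is a minimal sketch that pulls out the text captured by the first parenthesised group. The helper name first_group is made up for illustration and is not part of regex.h. It assumes the pattern is written in extended syntax (REG_EXTENDED, described further down) and relies on the rm_so and rm_eo offset fields of regmatch_t.

```c
#include <string.h>
#include <sys/types.h>
#include <regex.h>

/*
 * first_group is a hypothetical helper, not part of <regex.h>.
 * Copies the text captured by the first parenthesised group of `pattern`
 * into `out` (at most outsz - 1 bytes, always null-terminated).
 * Returns 0 on success, -1 on compile error, no match, or no group.
 */
int first_group(const char *pattern, const char *text,
                char *out, size_t outsz)
{
    regex_t preg;
    regmatch_t pmatch[2];   /* [0] = whole match, [1] = first group */

    if (outsz == 0 || regcomp(&preg, pattern, REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&preg, text, 2, pmatch, 0);
    regfree(&preg);

    /* rm_so is -1 when the group did not take part in the match. */
    if (rc != 0 || pmatch[1].rm_so == -1)
        return -1;

    size_t len = (size_t)(pmatch[1].rm_eo - pmatch[1].rm_so);
    if (len >= outsz)
        len = outsz - 1;    /* truncate rather than overflow */
    memcpy(out, text + pmatch[1].rm_so, len);
    out[len] = '\0';
    return 0;
}
```

Calling first_group("version ([0-9]+)", "kernel version 26 ok", buf, sizeof buf) should leave "26" in buf. The offsets are relative to the start of the searched string, which is why the copy starts at text + rm_so.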

POSIX REGEX COMPILING
regcomp is used to compile a regular expression into a form that is
suitable for subsequent regexec searches.

regcomp is supplied with preg, a pointer to a pattern buffer storage
area; regex, a pointer to the null-terminated string; and cflags, flags used
to determine the type of compilation.

All regular expression searching must be done via a compiled pattern
buffer, thus regexec must always be supplied with the address of a
regcomp-initialized pattern buffer.
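
The compile-then-search cycle described above might look roughly like this. It is only a sketch: match_pattern is a made-up helper name, not part of regex.h, and the 128-byte error buffer is an arbitrary choice. It compiles with regcomp, searches once with regexec, reports compile failures through regerror, and releases the pattern buffer with regfree.

```c
#include <stdio.h>
#include <sys/types.h>
#include <regex.h>

/*
 * match_pattern is a hypothetical helper, not part of <regex.h>.
 * Returns 1 if `text` contains a match for `pattern`, 0 if it does not
 * (or if regexec itself fails), and -1 if the pattern does not compile.
 */
int match_pattern(const char *pattern, const char *text, int cflags)
{
    regex_t preg;
    int rc = regcomp(&preg, pattern, cflags);
    if (rc != 0) {
        char errbuf[128];   /* arbitrary size, long messages are truncated */
        regerror(rc, &preg, errbuf, sizeof errbuf);
        fprintf(stderr, "regcomp failed: %s\n", errbuf);
        return -1;
    }

    /* nmatch = 0 and pmatch = NULL: only a yes/no answer is wanted. */
    rc = regexec(&preg, text, 0, NULL, 0);

    /* The compiled buffer owns memory, so free it once it is done with. */
    regfree(&preg);
    return rc == 0 ? 1 : 0;
}
```

For example, match_pattern("ab*c", "xxabbbcxx", 0) should return 1, while an unterminated bracket expression such as "[a" should fail to compile and return -1. Compiling on every call is wasteful if the same pattern is reused; in that case it would make more sense to keep the regex_t around and call regexec repeatedly.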

cflags may be the bitwise-or of one or more of the following:

REG_EXTENDED
Use POSIX Extended Regular Expression syntax when
interpreting regex. If not set, POSIX Basic Regular Expression syntax is
used.

REG_ICASE
Do not differentiate case. Subsequent regexec searches using
this pattern buffer will be case insensitive.

REG_NOSUB
Support for substring addressing of matches is not required.
The nmatch and pmatch parameters to regexec are ignored if the pattern
buffer supplied was compiled with this flag set.

REG_NEWLINE
Match-any-character operators don't match a newline.

A non-matching list ([^...]) not containing a newline does
not match a newline.

Match-beginning-of-line operator (^) matches the empty string
immediately after a newline, regardless of whether eflags, the execution
flags of regexec, contains REG_NOTBOL.

Match-end-of-line operator ($) matches the empty string
immediately before a newline, regardless of whether eflags contains
REG_NOTEOL.
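
To see what each cflags value changes, here is a small sketch that runs the same kind of search under different flags. The names search and print_flag_examples are invented for this example, and the patterns and strings are just illustrative picks. The BRE row assumes an unescaped | is treated as a literal character in basic syntax, which is how glibc behaves.

```c
#include <stdio.h>
#include <sys/types.h>
#include <regex.h>

/* Hypothetical helper: 1 on match, 0 on no match, -1 on compile error. */
int search(const char *pattern, const char *text, int cflags)
{
    regex_t preg;
    if (regcomp(&preg, pattern, cflags) != 0)
        return -1;
    int rc = regexec(&preg, text, 0, NULL, 0);
    regfree(&preg);
    return rc == 0 ? 1 : 0;
}

/* Prints one line per case so the effect of each flag can be compared. */
void print_flag_examples(void)
{
    static const struct {
        const char *label;
        const char *pattern;
        const char *text;
        int cflags;
    } cases[] = {
        /* Alternation only works in extended syntax. */
        { "basic: cat|dog",        "cat|dog", "hot dog", 0 },
        { "REG_EXTENDED: cat|dog", "cat|dog", "hot dog", REG_EXTENDED },
        /* Case folding. */
        { "no flags: linux",       "linux",   "Arch LINUX", 0 },
        { "REG_ICASE: linux",      "linux",   "Arch LINUX", REG_ICASE },
        /* Anchors match at line boundaries only with REG_NEWLINE. */
        { "no flags: ^two$",       "^two$",   "one\ntwo\nthree", 0 },
        { "REG_NEWLINE: ^two$",    "^two$",   "one\ntwo\nthree", REG_NEWLINE },
        /* '.' stops matching a newline under REG_NEWLINE. */
        { "no flags: one.two",     "one.two", "one\ntwo", 0 },
        { "REG_NEWLINE: one.two",  "one.two", "one\ntwo", REG_NEWLINE },
        /* REG_NOSUB: yes/no result only, flags can be OR-ed together. */
        { "REG_ICASE|REG_NOSUB",   "linux",   "Arch LINUX",
          REG_ICASE | REG_NOSUB },
    };

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++)
        printf("%-24s -> %d\n", cases[i].label,
               search(cases[i].pattern, cases[i].text, cases[i].cflags));
}
```

Calling print_flag_examples() from main prints a 1 or 0 next to each label. Since search passes nmatch as 0 anyway, adding REG_NOSUB does not change its result; the flag only matters to callers that would otherwise ask for match offsets.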
