Home
Last updated
The regex
package is part of the Go standard library and implements regular expression search and pattern matching. The package implements the RE2 syntax and is the same general syntax used by Perl and Python. It can operate on strings or bytes.
To use a regular expression in Go it must first be parsed and returned as a Regexp
object.
r, err := regexp.Compile("^foo")
if err != nil {
log.Fatal(err)
}
When using the Compile
method if the regular expression is not parsed successfully an error is returned that needs to be handled. The MustCompile
method is similar but panics if that parsing of the regex fails.
// the compiler panics if this fails
r := regexp.MustCompile("^foo")
Once the regular experession has been parsed and returned it can be used with the methods provided by the regexp
package. The regexp
package supports
To find a string use the FindString
method. This returns the left most match of the regular expression. If nothing is found an empty string is returned.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile("ck$")
fmt.Println(re.FindString("hack"))
fmt.Println(re.FindString("cricket"))
}
Running this returns the following. Run on Go Playground
ck
Program exited.
To find a string index use the FindStringIndex
method. This returns the left most match of the regular expression as a two-element slice. If nothing is found an nil value is returned.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile("tel")
fmt.Println(re.FindStringIndex("telephone"))
fmt.Println(re.FindStringIndex("carpet"))
fmt.Println(re.FindStringIndex("cartel"))
}
Running this returns the following. Run on Go Playground
[0 3]
[]
[3 6]
Program exited.
To find a string submatch use the FindStringSubmatch
method. This returns the left most match of the regular expression and an of the submatches. If nothing is found an nil value is returned.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile("h([a-z]+)king")
fmt.Println(re.FindStringSubmatch("hacking hiking"))
fmt.Println(re.FindStringSubmatch("licking"))
}
Running this returns the following. Run on Go Playground
[hacking ac]
[]
Program exited.
To find a string index use the FindStringSubmatchIndex
method. This returns a slice of the index positions of a match and any substring match. If nothing is found an nil value is returned.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile("h([a-z]+)king")
fmt.Println(re.FindStringSubmatchIndex("hacking hiking"))
fmt.Println(re.FindStringSubmatchIndex("licking"))
}
Running this returns the following. Run on Go Playground
[0 7 1 3]
[]
Program exited.
To replace all occurrence of a string use the ReplaceAllString
method. This returns a copy of of the original string with replacements made. If not match is found the original string is returned.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile("ise")
s := "ize"
fmt.Println(re.ReplaceAllString("realise", s))
fmt.Println(re.ReplaceAllString("organise", s))
fmt.Println(re.ReplaceAllString("analyse", s))
}
Running this returns the following. Run on Go Playground
realize
organize
analyse
Program exited.
In the following example the regexp
package is used to scrape the contents of h3
tags from a page. The raw html is parsed against a regex to find the h3
tags, then any HTML is stripped before being escaped and sent to standard output.
package main
import (
"fmt"
"html"
"io/ioutil"
"net/http"
"regexp"
)
func main() {
resp, err := http.Get("https://petition.parliament.uk/petitions")
if err != nil {
fmt.Println("http get error.")
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
fmt.Println("http read error")
return
}
src := string(body)
r, _ := regexp.Compile("\\<h3\\>.*\\</h3\\>")
rHTML, _ := regexp.Compile("<[^>]*>")
titles := r.FindAllString(src, -1)
for _, title := range titles {
cleanTitle := rHTML.ReplaceAllString(title, "")
fmt.Println(html.UnescapeString(cleanTitle))
}
}
This returns the contents of all h3
tags on the page if you want to see what UK Citizens are petitioning their government about.
EU Referendum Rules triggering a 2nd EU Referendum
Give the Meningitis B vaccine to ALL children, not just newborn babies.
Block Donald J Trump from UK entry
...
Note that this can also be achieved with the html parser available in the Go supplementary network libraries.
package main
import (
"fmt"
"golang.org/x/net/html"
"net/http"
)
func main() {
resp, err := http.Get("https://petition.parliament.uk/petitions")
if err != nil {
fmt.Println("http get error.")
}
defer resp.Body.Close()
doc, err := html.Parse(resp.Body)
if err != nil {
fmt.Println("http read error")
return
}
var f func(*html.Node)
f = func(n *html.Node) {
if n.Type == html.ElementNode && n.Data == "h3" {
fmt.Println(n.FirstChild.FirstChild.Data)
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
f(c)
}
}
f(doc)
}
Have an update or suggestion for this article? You can edit it here and send me a pull request.