Count number of occurrences of a character in a string

I was looking for a faster alternative to my own way of counting occurrences of a character in a string in R, and I found this post with the following solution:

countCharOccurrences <- function(char, s) {
  s2 <- gsub(char, "", s)
  return(nchar(s) - nchar(s2))
}

I don’t see any contact information there for writing to the author and suggesting my solution, so I’ll put it here:


countCharOccurrences2 <- function(char, s) {
  length(strsplit(s, char, fixed = TRUE)[[1]]) - 1
}

This test shows mine is about four times faster (median of roughly 4.0 versus 15.7 microseconds).


library(microbenchmark)
microbenchmark(countCharOccurrences(":","2:2:00"), countCharOccurrences2(":","2:2:00"), times=10000L)

Unit: microseconds
                                 expr    min     lq      mean median     uq      max neval
  countCharOccurrences(":", "2:2:00") 13.866 15.326 16.138550 15.690 16.056 1807.277 10000
 countCharOccurrences2(":", "2:2:00")  2.190  3.284  3.940256  4.014  4.380   25.178 10000

In practice I will pass the string as the first argument, and hard-coding fixed = TRUE isn’t ideal; the function should instead forward extra arguments through ... (the dots). Apart from that, it’s still probably quite an awkward way to do it. What is the proper way?
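One possible answer, as a rough sketch rather than the definitive way: count the matches directly with gregexpr() and regmatches() from base R. The name count_char is made up for this example; it takes the string first and forwards extra arguments such as fixed = TRUE through the dots. It also sidesteps a quirk of the strsplit() approach, which drops a trailing empty piece and so undercounts when the string ends with the separator.

```r
# Hypothetical helper (name and argument order are my own choice):
# string first, pattern second, extra arguments forwarded to gregexpr().
count_char <- function(s, char, ...) {
  # gregexpr() gives the match positions for each element of s (-1 if none);
  # regmatches() extracts the matched substrings, so lengths() is the count.
  lengths(regmatches(s, gregexpr(char, s, ...)))
}

count_char("2:2:00", ":", fixed = TRUE)                  # 2
count_char(c("1:00", "1:23:00", ""), ":", fixed = TRUE)  # 1 2 0

# Trailing separator: strsplit() returns only "1" and "00" here,
# so length(...) - 1 would report 1 instead of 2.
strsplit("1:00:", ":", fixed = TRUE)[[1]]
count_char("1:00:", ":", fixed = TRUE)                   # 2
```

I have not benchmarked this version, so it may well be slower than the strsplit() one; the point is only that it is vectorised and handles the edge cases.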

Also, I needed the count to know which format to apply when converting a string to a time: sometimes the timestamp looks like "1:00" and sometimes like "1:23:00", depending on whether it is longer than an hour. I count the ":" characters and apply either hms() or ms() from the lubridate library. There should be a better way to do this, right?
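One way to avoid the counting altogether, sketched here in base R under the assumption that every timestamp is either "m:s" or "h:m:s": split on ":" and weight the fields from the right by 1, 60 and 3600. The name to_seconds is hypothetical.

```r
# Hypothetical helper: converts "m:s" or "h:m:s" strings to seconds
# without first checking how many ":" each string contains.
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE), function(parts) {
    parts <- as.numeric(parts)
    # Rightmost field is seconds, then minutes, then hours:
    # weights 60^0, 60^1, 60^2 applied from the right.
    sum(rev(parts) * 60^(seq_along(parts) - 1))
  }, numeric(1))
}

to_seconds(c("1:00", "1:23:00"))  # 60 4980
```

If a lubridate period is what you need in the end, the result can be handed to seconds_to_period().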